Spatially addressable molecular barcoding

ABSTRACT

The disclosure provides for methods, compositions, systems, devices, and kits for determining the number of distinct targets in distinct spatial locations within a sample. In some examples, the methods include: stochastically barcoding the plurality of targets in the sample using a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a spatial label and a molecular label; estimating the number of each of the plurality of targets using the molecular label; and identifying the spatial location of each of the plurality of targets using the spatial label. The method can be multiplexed.

RELATED APPLICATIONS

The present application is a reissue of U.S. Pat. No. 9,727,810, issued on Aug. 8, 2017 from U.S. patent application Ser. No. 15/055,445, filed on Feb. 26, 2016, which claims priority under 35 U.S.C. §119(e) to U.S. Provisional Application No. 62/126,230, filed on Feb. 27, 2015, and U.S. Provisional Application No. 62/162,471, filed on May 15, 2015. The content of each of these related applications and the patent is herein expressly incorporated by reference in its entirety.

BACKGROUND

Field

The present disclosure relates generally to the field of molecular biology and more particularly to molecular barcoding.

Description of the Related Art

Methods and techniques such as in situ hybridization and immunohistochemistry allow the visualization of the locations of target molecules within the sample. Methods and techniques for labeling target molecules for amplification and sequencing, for example stochastic barcoding, are useful for determining the identities of the target molecules. Determining the identities and locations of the target molecules in the sample is important for clinical applications, diagnostics, and biomedical research. Thus, there is a need for methods and techniques capable of correlating the identities of the target molecules with the locations of target molecules within the sample.

SUMMARY

Disclosed herein are methods for determining the number and spatial locations of a plurality of targets in a sample. In some embodiments, the methods include: stochastically barcoding the plurality of targets in the sample using a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a spatial label and a molecular label; estimating the number of each of the plurality of targets using the molecular label; and identifying the spatial location of each of the plurality of targets using the spatial label. The method can be multiplexed.

In some embodiments, stochastically barcoding the plurality of targets in the sample can include hybridizing the plurality of stochastic barcodes with the plurality of targets to generate stochastically barcoded targets, and at least one of the plurality of targets is hybridized to one of the plurality of stochastic barcodes. Stochastically barcoding the plurality of targets in the sample can include generating an indexed library of the stochastically barcoded targets. The molecular labels of different stochastic barcodes can be different from one another. The sample can be physically divided or intact during stochastic barcoding of the plurality of targets in the sample. The spatial locations of the plurality of targets in the sample can be on a surface of the sample, inside the sample, subcellularly in the sample, or any combination thereof. Stochastically barcoding the plurality of targets in the sample can be performed on the surface of the sample, subcellularly in the sample, inside the sample, or any combination thereof.

In some embodiments, the spatial label can include 5-20 nucleotides. The molecular label can include 5-20 nucleotides. Estimating the number of the plurality of targets using the molecular label can include determining sequences of the spatial labels and molecular labels of the plurality of the stochastic barcodes and counting the number of the molecular labels with distinct sequences. Determining the sequences of the spatial labels and the molecular labels of the plurality of the stochastic barcodes can include sequencing some or all of the plurality of stochastic barcodes. Sequencing some or all of the plurality of stochastic barcodes can include generating sequences each with a read length of 100 or more bases. Identifying the spatial locations of the plurality of targets can include correlating the spatial labels of the plurality of the stochastic barcodes with the spatial locations of the plurality of targets in the sample.
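
By way of a non-limiting illustration, the counting and correlating steps described above can be sketched in Python as follows. The function names, the read tuple layout, and the lookup table from spatial label to location are hypothetical and are assumed here only for illustration; the sketch also assumes error-free sequences, so that reads sharing a molecular label are treated as copies of one barcoded target.

```python
from collections import defaultdict


def count_targets(reads):
    """Count distinct molecular labels per (spatial label, target).

    `reads` is an iterable of (spatial_label, molecular_label, target_id)
    tuples, one per sequenced stochastically barcoded target. Reads that
    repeat the same molecular label are counted once.
    """
    distinct = defaultdict(set)
    for spatial_label, molecular_label, target_id in reads:
        distinct[(spatial_label, target_id)].add(molecular_label)
    return {key: len(labels) for key, labels in distinct.items()}


def locate_counts(counts, location_of_label):
    """Replace each spatial label with its location in the sample.

    `location_of_label` is an assumed lookup table from spatial label
    sequence to a location; labels missing from the table are skipped.
    """
    located = {}
    for (spatial_label, target_id), n in counts.items():
        if spatial_label in location_of_label:
            key = (location_of_label[spatial_label], target_id)
            located[key] = located.get(key, 0) + n
    return located


# Hypothetical reads: the second read duplicates the first.
reads = [
    ("AAAAA", "CCCCC", "GENE1"),
    ("AAAAA", "CCCCC", "GENE1"),
    ("AAAAA", "GGGGG", "GENE1"),
    ("TTTTT", "CCCCC", "GENE1"),
]
counts = count_targets(reads)
print(locate_counts(counts, {"AAAAA": (0, 0), "TTTTT": (0, 1)}))
```

In this sketch the estimate for a target at a location is simply the number of distinct molecular label sequences observed with the corresponding spatial label.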

In some embodiments, the methods can include visualizing the plurality of targets in the sample. Visualizing the plurality of targets in the sample can include mapping the plurality of targets onto a map of the sample. Mapping the plurality of targets onto the map of the sample can include generating a two dimensional map or a three dimensional map of the sample. The two dimensional map and the three dimensional map can be generated prior to or after stochastically barcoding the plurality of targets in the sample. In some embodiments, the two dimensional map and the three dimensional map can be generated before or after lysing the sample. Lysing the sample before or after generating the two dimensional map or the three dimensional map can include heating the sample, contacting the sample with a detergent, changing the pH of the sample, or any combination thereof.

In some embodiments, the sample can include a plurality of cells and the plurality of targets can be associated with the plurality of cells. The plurality of cells can include one or more cell types. At least one of the one or more cell types can be a brain cell, a heart cell, a cancer cell, a circulating tumor cell, an organ cell, an epithelial cell, a metastatic cell, a benign cell, a primary cell, a circulatory cell, or any combination thereof. The plurality of targets can include ribonucleic acids (RNAs), messenger RNAs (mRNAs), microRNAs, small interfering RNAs (siRNAs), RNA degradation products, RNAs each comprising a poly(A) tail, or any combination thereof. Stochastically barcoding the plurality of targets in the sample can be performed with a solid support including the plurality of stochastic barcodes. In some embodiments, the methods can include decoding the solid support. The solid support can include a plurality of synthetic particles associated with the plurality of stochastic barcodes. The spatial labels of the plurality of stochastic barcodes on different solid supports can differ by at least one nucleotide.

In some embodiments, each of the plurality of stochastic barcodes can include one or more of a universal label and a cellular label, wherein universal labels can be the same for the plurality of stochastic barcodes on the solid support and cellular labels can be the same for the plurality of stochastic barcodes on the solid support. The universal label can include 5-20 nucleotides. The cellular label can include 5-20 nucleotides. The solid support can include the plurality of stochastic barcodes in two dimensions or three dimensions. The synthetic particles can be beads. The beads can be silica gel beads, controlled pore glass beads, magnetic beads, Dynabeads, Sephadex/Sepharose beads, cellulose beads, polystyrene beads, or any combination thereof. The solid support can include a polymer, a matrix, a hydrogel, a needle array device, an antibody, or any combination thereof.

Disclosed herein are methods for determining spatial locations of a plurality of targets in a sample. In some embodiments, the methods include: stochastically barcoding the plurality of targets in the sample at one or more time points using a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a spatial label; and identifying the spatial location of each of the plurality of targets using the spatial label.

In some embodiments, stochastically barcoding the plurality of targets in the sample using the plurality of stochastic barcodes can include stochastically barcoding the plurality of targets in the sample at different time points using the plurality of stochastic barcodes. Each of the plurality of stochastic barcodes can include a dimension label, and the dimension labels of the plurality of stochastic barcodes used for stochastically barcoding the plurality of targets at the different time points can be different. The dimension labels can correlate with the different time points.

In some embodiments, stochastically barcoding the plurality of targets in the sample can include contacting the sample with a device. The device can be a needle, a needle array, a tube, a suction device, an injection device, an electroporation device, a fluorescence-activated cell sorter device, a microfluidic device, or any combination thereof. The device can contact sections of the sample at a specified rate. The specified rate can correlate the spatial locations of the plurality of targets with the one or more time points. Stochastically barcoding the plurality of targets in the sample can be performed with a solid support including a plurality of synthetic particles associated with the plurality of stochastic barcodes.
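
As a non-limiting illustration of how a specified rate can correlate time points with spatial locations, the following Python sketch assumes that the device advances along a straight path at a constant rate. The function names, the units (seconds and micrometers), and the time point identifiers are hypothetical and are not taken from the disclosure.

```python
def position_at(elapsed_s, rate_um_per_s, start_um=0.0):
    """Position along the sample reached after `elapsed_s` seconds.

    Assumes a constant rate along a straight path (an assumption of
    this sketch, not a requirement of the methods).
    """
    return start_um + rate_um_per_s * elapsed_s


def positions_by_time_point(time_of_point, rate_um_per_s, start_um=0.0):
    """Map each time point identifier to the position contacted then.

    `time_of_point` maps a hypothetical identifier (for example, one
    recorded for each round of barcoding) to elapsed seconds.
    """
    return {
        point: position_at(t, rate_um_per_s, start_um)
        for point, t in time_of_point.items()
    }


# Hypothetical example: barcoding at 0 s and 2 s with a rate of 50 um/s.
print(positions_by_time_point({"T1": 0.0, "T2": 2.0}, 50.0))
```

Under these assumptions, targets barcoded at a later time point are assigned to a location proportionally farther along the path of the device.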

Disclosed herein are synthetic particles. In some embodiments, each synthetic particle includes: a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a cellular label and a molecular label; a first group of optical labels; a second group of optical labels, wherein each optical label in the first group of optical labels comprises a first optical moiety and each optical label in the second group of optical labels comprises a second optical moiety, and wherein each of the plurality of synthetic particles is associated with an optical barcode comprising the first optical moiety and the second optical moiety.

In some embodiments, the molecular labels of the plurality of stochastic barcodes are different from one another, and the molecular labels are selected from a group comprising at least 100 molecular labels with unique sequences. The cellular labels of the plurality of stochastic barcodes can be the same. The first optical moiety and the second optical moiety are selected from a group comprising two or more spectrally-distinct optical moieties. Each of the plurality of stochastic barcodes can include a spatial label, wherein the spatial labels of the plurality of stochastic barcodes differ from one another by at least one nucleotide.
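
As a non-limiting illustration, the optical barcodes available from a given set of spectrally-distinct optical moieties can be enumerated with the Python sketch below. The sketch assumes that the first and second groups of optical labels can be told apart, so that the pair (red, green) differs from (green, red); the disclosure does not state this, and if the groups cannot be distinguished, `itertools.combinations_with_replacement` would be the appropriate substitute. The moiety names are hypothetical.

```python
from itertools import product


def optical_barcodes(moieties):
    """All (first moiety, second moiety) pairs drawn from `moieties`.

    Assumes the two groups of optical labels are distinguishable, so
    order matters; this is an assumption of the sketch.
    """
    return list(product(moieties, repeat=2))


# Hypothetical set of three spectrally-distinct moieties.
barcodes = optical_barcodes(["red", "green", "blue"])
print(len(barcodes))
```

With k spectrally-distinct moieties and two distinguishable groups, this enumeration yields k squared optical barcodes.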

In some embodiments, each of the plurality of stochastic barcodes further comprises a universal label, wherein universal labels of all stochastic barcodes on the particle are the same. The synthetic particle can be a bead or a magnetic bead. The bead can be a silica gel bead, a controlled pore glass bead, a magnetic bead, a Dynabead, a Sephadex/Sepharose bead, a cellulose bead, a polystyrene bead, or any combination thereof.

Disclosed herein are methods for determining spatial locations of a plurality of targets in a sample. In some embodiments, the methods include: stochastically barcoding the plurality of targets in the sample using a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a pre-spatial label; concatenating one or more spatial label blocks onto the pre-spatial label to generate a spatial label; and identifying the spatial location of each of the plurality of targets using the spatial label.
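
As a non-limiting illustration of concatenating spatial label blocks onto a pre-spatial label, the following Python sketch assumes a rectangular grid of locations in which one block is added per row and a second block per column, so that each location ends up with a distinct spatial label. The grid layout, the two-round scheme, the function name, and the example sequences are hypothetical and are assumed only for illustration.

```python
def spatial_labels_for_grid(pre_spatial_label, row_blocks, col_blocks):
    """Build a spatial label for each (row, column) location.

    Each label is the pre-spatial label followed by the block for the
    row and then the block for the column. Two rounds of concatenation
    over a grid are an assumption of this sketch.
    """
    labels = {}
    for r, row_block in enumerate(row_blocks):
        for c, col_block in enumerate(col_blocks):
            labels[(r, c)] = pre_spatial_label + row_block + col_block
    return labels


# Hypothetical pre-spatial label and blocks for a 2 x 2 grid.
labels = spatial_labels_for_grid("ACGT", ["AA", "GG"], ["AC", "TG"])
for location, label in sorted(labels.items()):
    print(location, label)
```

Because each block sequence is distinct within its round, the concatenated label can be read back to identify the row and the column of the location.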

In some embodiments, stochastically barcoding the plurality of targets in the sample can include hybridizing the plurality of stochastic barcodes with the plurality of targets to generate stochastically barcoded targets, and at least one of the plurality of targets is hybridized to one of the plurality of stochastic barcodes. Stochastically barcoding the plurality of targets in the sample can include generating an indexed library of the stochastically barcoded targets. The spatial label can include 5-20 nucleotides. The sample can include a plurality of cells and the plurality of targets can be associated with the plurality of cells. The plurality of targets can include ribonucleic acids (RNAs), messenger RNAs (mRNAs), microRNAs, small interfering RNAs (siRNAs), RNA degradation products, RNAs each comprising a poly(A) tail, or any combination thereof. Stochastically barcoding the plurality of targets in the sample can be performed with a solid support comprising the plurality of stochastic barcodes. In some embodiments, the methods can include decoding the solid support. The solid support can include a plurality of synthetic particles associated with the plurality of stochastic barcodes. The synthetic particles can be beads.

Disclosed herein are methods for determining spatial locations of a plurality of targets in a sample. In some embodiments, the methods include: imaging the sample to generate a sample image; stochastically barcoding the plurality of targets in the sample using a plurality of stochastic barcodes to generate stochastically barcoded targets, wherein each of the plurality of stochastic barcodes can include a spatial label; and identifying the spatial location of each of the plurality of targets using the spatial label.

In some embodiments, identifying the spatial location of each of the plurality of targets using the spatial label can include correlating the sample image with the spatial labels of the plurality of targets in the sample. Imaging the sample can include staining the sample with a stain, wherein the stain can be a fluorescent stain, a negative stain, an antibody stain, or any combination thereof. Imaging the sample can include imaging the sample using optical microscopy, electron microscopy, confocal microscopy, fluorescence microscopy, or any combination thereof.

In some embodiments, the sample can include a tissue, a cell monolayer, fixed cells, a tissue section, or any combination thereof. Correlating the sample image with the spatial labels of the plurality of targets in the sample can include overlaying the sample image with the spatial labels of the plurality of targets in the sample. The sample can include a biological sample, a clinical sample, an environmental sample, a biological fluid, a tissue, or a cell from a subject. The subject can be a human, a mouse, a dog, a rat, or a vertebrate.

In some embodiments, the methods can include determining genotype, phenotype, or one or more genetic mutations of the subject based on the spatial labels of the plurality of targets in the sample. In some embodiments, the methods can include predicting susceptibility of the subject to one or more diseases. At least one of the one or more diseases can be cancer or a hereditary disease. The sample can include a plurality of cells and the plurality of targets can be associated with the plurality of cells. The plurality of cells can include one or more cell types. In some embodiments, the methods can include determining cell types of the plurality of cells in the sample. A drug can be chosen based on predicted responsiveness of the cell types of the plurality of cells in the sample.

Disclosed herein are methods for determining spatial locations of a plurality of single cells. In some embodiments, the methods can include: stochastically barcoding the plurality of single cells using a plurality of synthetic particles, wherein each of the plurality of synthetic particles can include a plurality of stochastic barcodes, a first group of optical labels, and a second group of optical labels, wherein each of the plurality of stochastic barcodes can include a cellular label and a molecular label, wherein each optical label in the first group of optical labels can include a first optical moiety and each optical label in the second group of optical labels can include a second optical moiety, and wherein each of the plurality of synthetic particles can be associated with an optical barcode including the first optical moiety and the second optical moiety; detecting the optical barcode of each of the plurality of synthetic particles to determine the location of each of the plurality of synthetic particles; and determining the spatial locations of the plurality of single cells based on the locations of the plurality of synthetic particles.

In some embodiments, stochastically barcoding the plurality of single cells using the plurality of synthetic particles can include contacting the plurality of single cells with the plurality of synthetic particles, and each of the plurality of synthetic particles can be in close proximity to a single cell or a small number of cells. Each of the plurality of single cells can include a plurality of targets, and stochastically barcoding the plurality of single cells can include hybridizing the plurality of stochastic barcodes with the plurality of targets to generate stochastically barcoded targets, and at least one of the plurality of targets can be hybridized to one of the plurality of stochastic barcodes.

In some embodiments, the cellular labels of the plurality of stochastic barcodes on one synthetic particle can have the same sequence and the cellular labels of the plurality of stochastic barcodes on different synthetic particles can have different sequences. The molecular labels of the plurality of stochastic barcodes on one synthetic particle can be different from one another, and the molecular labels can be selected from a group including at least 100 molecular labels with unique sequences. The first optical moiety and the second optical moiety can be selected from a group including two or more spectrally-distinct optical moieties. Detecting the optical barcodes of the plurality of synthetic particles and determining the locations of the plurality of synthetic particles can include generating an optical image showing the optical barcodes and the locations of the plurality of synthetic particles.
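
As a non-limiting illustration of determining the locations of single cells from the locations of synthetic particles, the Python sketch below joins two lookup tables: the particle locations read from an optical image, keyed by optical barcode, and an assumed table relating each optical barcode to the cellular label carried by the same particle. The function name, the table layouts, and the premise that this relationship is known in advance are hypothetical and are not stated in the disclosure.

```python
def locate_cells(particle_positions, cellular_label_of):
    """Assign an image location to each cellular label.

    `particle_positions` maps an optical barcode to the (x, y) location
    of its particle in the optical image. `cellular_label_of` maps an
    optical barcode to the cellular label on the same particle; this
    table is an assumption of the sketch. Particles whose optical
    barcode is not in the table are skipped.
    """
    cell_positions = {}
    for optical_barcode, xy in particle_positions.items():
        cellular_label = cellular_label_of.get(optical_barcode)
        if cellular_label is not None:
            cell_positions[cellular_label] = xy
    return cell_positions


# Hypothetical image readout and particle lookup table.
positions = {("red", "green"): (12, 40), ("blue", "blue"): (77, 5)}
table = {("red", "green"): "CELL_A", ("blue", "blue"): "CELL_B"}
print(locate_cells(positions, table))
```

Sequenced targets carrying a given cellular label can then be attributed to the cell, or small number of cells, in close proximity to the particle at that location.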

In some embodiments, the plurality of single cells can include cells distributed across a well array including wells, and each of a majority of the wells in the well array contains at most one single cell. In some embodiments, the methods can include lysing the plurality of single cells; and generating an indexed library of stochastically barcoded targets, wherein each of the stochastically barcoded targets can include a cellular label sequence, a molecular label sequence, and at least a portion of the complementary sequence of one of the plurality of targets. The methods can include amplifying the stochastically barcoded targets of the indexed library to generate amplified stochastically barcoded targets; and sequencing the amplified stochastically barcoded targets to determine the number of amplified stochastically barcoded targets with unique molecular label sequences and identical complementary sequence, wherein the number of amplified stochastically barcoded targets with unique molecular label sequences and identical complementary sequence can be substantially the same as the occurrences of targets with sequences complementary to the identical complementary sequence in the single cell or the small number of cells. The plurality of cells can include a tissue, a cell monolayer, fixed cells, a tissue section, or any combination thereof. Amplifying the labeled target molecules can include bridge amplification, amplification with a gene specific primer, a universal primer, an oligo(dT) primer, or any combination thereof.

Disclosed herein are methods for identifying distinct cells in two or more samples. In some embodiments, the methods can include: stochastically barcoding a plurality of targets in the two or more samples using a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes can include a spatial label and a molecular label; estimating the number of the plurality of targets in the two or more samples using the molecular label; and distinguishing the two or more samples from each other using the spatial label, wherein the plurality of targets associated with stochastic barcodes with different spatial labels can be from different samples.

In some embodiments, stochastically barcoding the plurality of targets in the two or more samples can include hybridizing the plurality of stochastic barcodes with the plurality of targets to generate stochastically barcoded targets, and at least one of the plurality of targets can be hybridized to one of the plurality of stochastic barcodes. Stochastically barcoding the plurality of targets in the two or more samples can include generating an indexed library of the stochastically barcoded targets. The spatial label can include 5-20 nucleotides. The molecular label can include 5-20 nucleotides. Each of the two or more samples can include a plurality of cells and the plurality of targets can be associated with the plurality of cells. The plurality of targets can include ribonucleic acids (RNAs), messenger RNAs (mRNAs), microRNAs, small interfering RNAs (siRNAs), RNA degradation products, RNAs each including a poly(A) tail, or any combination thereof. Stochastically barcoding the plurality of targets in the two or more samples can be performed with a solid support including a plurality of synthetic particles associated with the plurality of stochastic barcodes. The synthetic particles can be beads. The beads can be silica gel beads, controlled pore glass beads, magnetic beads, Dynabeads, Sephadex/Sepharose beads, cellulose beads, polystyrene beads, or any combination thereof.

Disclosed herein are kits for determining the number and spatial locations of a plurality of targets in a sample. In some embodiments, the kits can include: a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes can include a spatial label, wherein the spatial labels of the plurality of stochastic barcodes differ from one another by at least one nucleotide; and instructions for using the plurality of stochastic barcodes. The plurality of stochastic barcodes can be associated with a solid support. The solid support can include a plurality of synthetic particles associated with the plurality of stochastic barcodes.

In some embodiments, each of the plurality of synthetic particles can include a first group of optical labels and a second group of optical labels, and each optical label in the first group of optical labels can include a first optical moiety, each optical label in the second group of optical labels can include a second optical moiety, and the first optical moiety and the second optical moiety can be selected from a group including two or more spectrally-distinct optical moieties. Each of the plurality of stochastic barcodes can include one or more of a molecular label, a universal label, and a cellular label, wherein universal labels and cellular labels of all stochastic barcodes on the solid support can be the same.

In some embodiments, the solid support can include the plurality of stochastic barcodes in two dimensions or three dimensions. The plurality of synthetic particles can be beads. The beads can be silica gel beads, controlled pore glass beads, magnetic beads, Dynabeads, Sephadex/Sepharose beads, cellulose beads, polystyrene beads, or any combination thereof. The synthetic particles can be magnetic beads. The solid support can include a polymer, a matrix, a hydrogel, a needle array device, an antibody, or any combination thereof. In some embodiments, the kits can include a buffer. The kits can include a cartridge. The solid support can be pre-loaded on a substrate. The kits can include one or more reagents for a reverse transcription reaction. The kits can include one or more reagents for an amplification reaction.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows a non-limiting exemplary embodiment for determining spatial locations of distinct targets in a sample.

FIG. 2 illustrates a non-limiting exemplary stochastic barcode.

FIG. 3 shows a non-limiting exemplary workflow of stochastic barcoding and digital counting.

FIG. 4 is a schematic illustration showing a non-limiting exemplary process for generating an indexed library of the stochastically barcoded targets from a plurality of targets.

FIG. 5 illustrates a non-limiting exemplary embodiment for determining spatial locations of targets in a sample by maintaining physiological orientation of sections of the sample.

FIG. 6 illustrates a non-limiting exemplary embodiment for determining spatial locations of targets in a sample by time points.

FIG. 7 illustrates a non-limiting exemplary embodiment for determining spatial locations of targets in a sample by randomizing the orientation of sections of the sample.

FIG. 8 shows a non-limiting exemplary schematic of label lithography.

FIG. 9 shows a non-limiting exemplary embodiment for determining spatial locations of targets in a sample using label lithography.

FIG. 10 shows a non-limiting exemplary embodiment for distinguishing targets of a sample for a plurality of samples.

FIG. 11 shows a non-limiting exemplary embodiment for distinguishing subcellular localization of targets in a cell.

FIG. 12 illustrates a non-limiting exemplary embodiment for homopolymer tailing.

FIG. 13 shows a non-limiting exemplary instrument used in the methods of the disclosure.

FIG. 14 illustrates a non-limiting exemplary architecture of a computer system that can be used in connection with embodiments of the present disclosure.

FIG. 15 illustrates a non-limiting exemplary architecture showing a network with a plurality of computer systems for use in the methods of the disclosure.

FIG. 16 illustrates a non-limiting exemplary architecture of a multiprocessor computer system using a shared virtual address memory space in accordance with the methods of the disclosure.

FIGS. 17A-C depict a non-limiting exemplary cartridge for use in the methods of the disclosure.

FIGS. 18A-B show different arrangements of optical labels on the surface of a synthetic particle.

FIG. 19 shows the hybridization of oligonucleotides in the first encoding step of Example 4.

FIG. 20 is a lookup table showing the oligonucleotide content in each of the 96 wells in the first plate.

FIG. 21 shows the single stranded oligonucleotides in the various regions on the synthetic particles after polymerization, ligation, and denaturation of duplex DNA in the first encoding step.

FIG. 22 shows the hybridization of oligonucleotides in the second encoding step of Example 4.

FIG. 23 is a lookup table showing the oligonucleotide content in each of the 96 wells in the second plate.

FIG. 24 shows the single stranded oligonucleotides in the various regions on the synthetic particles after polymerization, ligation, and denaturation of duplex DNA in the second encoding step.

FIG. 25 shows the hybridization of oligonucleotides in the third encoding step of Example 4.

FIG. 26 is a lookup table showing the oligonucleotide content in each of the 96 wells in the third plate.

FIG. 27 shows the single stranded oligonucleotides in the various regions on the synthetic particles after polymerization, ligation, and denaturation of duplex DNA in the third encoding step.

FIG. 28 is a schematic illustration of a non-limiting exemplary synthetic particle being coated with DNA barcodes and the spectrally resolvable barcode.

FIG. 29 shows an exemplary combination of the spectrally resolvable barcode, PS1-9, of a synthetic particle.

DETAILED DESCRIPTION

In the following detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, similar symbols typically identify similar components, unless context dictates otherwise. The illustrative embodiments described in the detailed description, drawings, and claims are not meant to be limiting. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented herein. It will be readily understood that the aspects of the present disclosure, as generally described herein, and illustrated in the Figures, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations, all of which are explicitly contemplated herein and made part of the disclosure herein.

In one aspect the disclosure provides for a method for determining the number and spatial location of one or more targets in a sample comprising: contacting the spatial location in the sample with one or more stochastic barcodes, wherein each stochastic barcode comprises a spatial label and a molecular label; estimating the number of the one or more targets in the spatial location using the molecular label; and identifying the spatial location of the one or more targets using the spatial label. In some embodiments, the contacting comprises hybridizing the stochastic barcode with the one or more targets. In some embodiments, the hybridizing comprises hybridizing the one or more targets such that each of the one or more targets is hybridized to a unique stochastic barcode. In some embodiments, molecular labels of the stochastic barcodes are different. In some embodiments, the sample is physically divided during the contacting. In some embodiments, the sample is intact during the contacting. In some embodiments, the contacting is performed on the surface of the sample. In some embodiments, the contacting is performed inside the sample. In some embodiments, the contacting is performed subcellularly in the sample. In some embodiments, the spatial location is subcellular. In some embodiments, the contacting is performed on a substrate. In some embodiments, the substrate comprises the one or more stochastic barcodes in a known order. In some embodiments, the substrate comprises the one or more stochastic barcodes in an unknown order. In some embodiments, the method further comprises decoding the substrate. In some embodiments, the spatial label comprises 5-20 nucleotides. In some embodiments, the estimating comprises generating a target-barcode molecule. In some embodiments, the target-barcode molecule comprises the sequence of a stochastic barcode with which it is associated.
In some embodiments, the estimating further comprises determining the sequence of the spatial label and the molecular label. In some embodiments, the method further comprises counting occurrences of distinct sequences of the molecular label. In some embodiments, the counting is used to estimate the number of one or more targets. In some embodiments, the determining comprises sequencing the stochastic barcodes. In some embodiments, the sequencing comprises sequencing with read lengths of at least 100 bases. In some embodiments, the sequencing comprises sequencing with read lengths of at least 500 bases. In some embodiments, the identifying comprises correlating the spatial label with the spatial location in the sample. In some embodiments, the method further comprises visualizing the number of the one or more targets at the spatial location. In some embodiments, the visualizing comprises mapping the number of the one or more targets onto a map of the sample. In some embodiments, the visualizing comprises imaging the sample at a time point selected from the group consisting of: imaging the sample prior to the contacting, imaging the sample after the contacting, imaging the sample before lysing the sample, imaging the sample after lysing the sample. In some embodiments, the imaging produces an image that is used to construct a map of a physical representation of the sample. In some embodiments, the map is two dimensional. In some embodiments, the map is three dimensional. In some embodiments, the sample comprises a single cell. In some embodiments, the sample comprises a plurality of cells. In some embodiments, the plurality of cells comprises one or more different cell types. In some embodiments, the one or more cell types are selected from the group consisting of: brain cells, heart cells, cancer cells, circulating tumor cells, organ cells, epithelial cells, metastatic cells, benign cells, primary cells, and circulatory cells, or any combination thereof.
In some embodiments, the sample comprises a solid tissue. In some embodiments, the sample is obtained from a subject. In some embodiments, the subject is a subject selected from the group consisting of: a human, a mammal, a dog, a rat, a mouse, a fish, a fly, a worm, a plant, a fungus, a bacterium, a virus, a vertebrate, and an invertebrate. In some embodiments, the one or more targets are ribonucleic acid molecules. In some embodiments, the ribonucleic acid molecules are selected from the group consisting of: mRNA, microRNA, mRNA degradation products, and ribonucleic acids comprising a poly(A) tail, or any combination thereof. In some embodiments, the targets are deoxyribonucleic acid molecules. In some embodiments, the contacting is performed with a solid support. In some embodiments, the solid support can comprise a plurality of stochastic barcodes. In some embodiments, each stochastic barcode of the plurality of stochastic barcodes comprises a spatial label. In some embodiments, spatial labels on different solid supports differ by at least one nucleotide. In some embodiments, the stochastic barcode further comprises a universal label and a cellular label. In some embodiments, the universal label and the cellular label are the same for all stochastic barcodes on the solid support. In some embodiments, the solid support comprises stochastic barcodes in two dimensions. In some embodiments, the solid support comprises stochastic barcodes in three dimensions. In some embodiments, the solid supports comprise a bead. In some embodiments, the bead is selected from the group consisting of: a silica gel bead, a controlled pore glass bead, a magnetic bead, a Dynabead, a Sephadex/Sepharose bead, a cellulose bead, and a polystyrene bead, or any combination thereof. In some embodiments, the bead comprises a magnetic bead. In some embodiments, the solid support is semi-solid. In some embodiments, the solid support comprises a polymer, a matrix, or a hydrogel.
In some embodiments, the solid support comprises a needle array device. In some embodiments, the solid support comprises an antibody. In some embodiments, the solid support comprises polystyrene.
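
By way of a non-limiting illustration, mapping the number of a target onto a two dimensional map of the sample can be sketched as follows. The sketch assumes that each spatial label has already been correlated with one (x, y) grid position and that the records belong to a single target; the function name `build_count_map`, the lookup table `label_to_xy`, and the grid dimensions are hypothetical and are not taken from the disclosure.

```python
from collections import defaultdict


def build_count_map(records, label_to_xy, width, height):
    """Return a height x width grid of estimated target counts.

    records: iterable of (spatial_label, molecular_label) pairs for one target.
    label_to_xy: hypothetical lookup from spatial label to an (x, y) grid cell.
    """
    # Collect the distinct molecular labels observed at each spatial label.
    seen = defaultdict(set)
    for spatial_label, molecular_label in records:
        seen[spatial_label].add(molecular_label)

    # Place the number of distinct molecular labels at the mapped position.
    grid = [[0] * width for _ in range(height)]
    for spatial_label, molecular_labels in seen.items():
        x, y = label_to_xy[spatial_label]
        grid[y][x] = len(molecular_labels)
    return grid
```

In this sketch, repeated observations of the same molecular label at the same spatial label contribute a single count, consistent with counting occurrences of distinct sequences of the molecular label.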

In one aspect the disclosure provides for a method for determining spatial locations of one or more targets in a sample by timing comprising: contacting the spatial location in the sample with one or more stochastic barcodes, at one or more time points, wherein each stochastic barcode comprises a spatial label; and identifying the spatial location of the one or more targets in the sample, wherein the one or more time points correlate to the spatial location. In some embodiments, stochastic barcodes at different time points comprise different dimension labels. In some embodiments, the dimension labels correlate to the one or more time points. In some embodiments, the contacting is performed by a device. In some embodiments, the device is a device selected from the group consisting of: a needle, a needle array, a tube, a suction device, an injection device, an electroporation device, a fluorescent activated cell sorter device, and a microfluidic device, or any combination thereof. In some embodiments, the device contacts the sections at a specified rate. In some embodiments, the specified rate is used to correlate the time point with the spatial location. In some embodiments, the contacting comprises hybridizing the one or more stochastic barcodes with the one or more targets. In some embodiments, the hybridizing comprises hybridizing the one or more targets such that each of the one or more targets is hybridized to a unique stochastic barcode. In some embodiments, the one or more stochastic barcodes comprises a molecular label. In some embodiments, the molecular label is different for each of the one or more stochastic barcodes. In some embodiments, the sample is physically divided during the contacting. In some embodiments, the sample is intact during the contacting. In some embodiments, the contacting is performed on the surface of the sample. In some embodiments, the contacting is performed inside the sample. In some embodiments, the contacting is performed subcellularly in the sample. 
In some embodiments, the spatial location is subcellular. In some embodiments, the contacting is performed on a substrate. In some embodiments, the substrate comprises the one or more stochastic barcodes in a known order. In some embodiments, the substrate comprises the one or more stochastic barcodes in an unknown order. In some embodiments, the method further comprises decoding the substrate. In some embodiments, the spatial label comprises from 5-20 nucleotides. In some embodiments, the method further comprises estimating the number of the one or more targets using the stochastic barcode. In some embodiments, the estimating comprises generating a target-barcode molecule. In some embodiments, the target-barcode molecule comprises the sequence of a stochastic barcode to which it is associated. In some embodiments, the estimating further comprises determining the sequence of the spatial label and a molecular label. In some embodiments, the method further comprises counting occurrences of distinct sequences of the molecular label. In some embodiments, the counting is used to estimate the number of one or more targets. In some embodiments, the determining comprises sequencing the stochastic barcodes. In some embodiments, the sequencing comprises sequencing with read lengths of at least 100 bases. In some embodiments, the sequencing comprises sequencing with read lengths of at least 500 bases. In some embodiments, the identifying comprises correlating the spatial label with the spatial location in the sample. In some embodiments, the method further comprises visualizing the number of the one or more targets at the spatial location. In some embodiments, the visualizing comprises mapping the number of the one or more targets onto a map of the sample. 
In some embodiments, the visualizing comprises imaging the sample at a time point selected from the group consisting of: imaging the sample prior to the contacting, imaging the sample after the contacting, imaging the sample before lysing the sample, and imaging the sample after lysing the sample. In some embodiments, the imaging produces an image that is used to construct a map of a physical representation of the sample. In some embodiments, the map is two dimensional. In some embodiments, the map is three dimensional. In some embodiments, the sample comprises a single cell. In some embodiments, the sample comprises a plurality of cells. In some embodiments, the plurality of cells comprises one or more different cell types. In some embodiments, the one or more cell types are selected from the group consisting of: brain cells, heart cells, cancer cells, circulating tumor cells, organ cells, epithelial cells, metastatic cells, benign cells, primary cells, and circulatory cells, or any combination thereof. In some embodiments, the sample comprises a solid tissue. In some embodiments, the sample is obtained from a subject. In some embodiments, the subject is a subject selected from the group consisting of: a human, a mammal, a dog, a rat, a mouse, a fish, a fly, a worm, a plant, a fungus, a bacterium, a virus, a vertebrate, and an invertebrate. In some embodiments, the one or more targets are ribonucleic acid molecules. In some embodiments, the ribonucleic acid molecules are selected from the group consisting of: mRNA, microRNA, mRNA degradation products, and ribonucleic acids comprising a poly(A) tail, or any combination thereof. In some embodiments, the targets are deoxyribonucleic acid molecules. In some embodiments, the contacting is performed with a solid support. In some embodiments, the solid support comprises a plurality of stochastic barcodes. In some embodiments, each stochastic barcode of the plurality of stochastic barcodes comprises a spatial label. 
In some embodiments, spatial labels on different solid supports differ by at least one nucleotide. In some embodiments, the stochastic barcode further comprises a universal label, a cellular label, and a molecular label. In some embodiments, the universal label and the cellular label are the same for all stochastic barcodes on a solid support. In some embodiments, the solid supports comprise stochastic barcodes in two dimensions. In some embodiments, the solid supports comprise stochastic barcodes in three dimensions. In some embodiments, the solid supports comprise a bead. In some embodiments, the bead is selected from the group consisting of: silica gel bead, controlled pore glass bead, magnetic bead, Dynabeads, Sephadex/Sepharose beads, cellulose beads, and polystyrene beads, or any combination thereof. In some embodiments, the bead comprises a magnetic bead. In some embodiments, the solid support is semi-solid. In some embodiments, the solid support comprises a polymer, a matrix, or a hydrogel. In some embodiments, the solid support comprises a needle array device. In some embodiments, the solid support comprises an antibody.
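
By way of a non-limiting illustration, correlating a time point with a spatial location by means of a specified rate can be sketched as follows. The sketch assumes, purely for illustration, that the device advances along a straight path at a constant rate, so that position is the starting position plus rate multiplied by elapsed time. The function names, the units, and the dimension label names are hypothetical.

```python
def location_from_time(time_s, rate_um_per_s, start_um=0.0):
    """Position along the device path (micrometers) at a given contact time.

    Assumes a constant, specified rate along a one dimensional path.
    """
    return start_um + rate_um_per_s * time_s


def assign_locations(dimension_label_times, rate_um_per_s, start_um=0.0):
    """Map each dimension label to a position, given its contact time.

    dimension_label_times: hypothetical dict of dimension label -> seconds.
    """
    return {
        label: location_from_time(t, rate_um_per_s, start_um)
        for label, t in dimension_label_times.items()
    }
```

For example, under these assumptions a dimension label applied 2 seconds after the start, with a rate of 50 micrometers per second, would be assigned a position of 100 micrometers along the path.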

In one aspect the disclosure provides for a method for determining the spatial location of one or more targets on a sample comprising: contacting one or more spatial locations of the sample with one or more stochastic barcodes, wherein each stochastic barcode comprises a pre-spatial label; concatenating one or more spatial label blocks onto the pre-spatial label, thereby generating a spatial label; and identifying the one or more spatial locations of the one or more targets in the sample by correlating a length of the spatial label with a spatial location in the sample. In some embodiments, spatial labels at distinct spatial locations have different lengths. In some embodiments, the contacting comprises hybridizing the stochastic barcode with the one or more targets. In some embodiments, the hybridizing comprises hybridizing the one or more targets such that each of the one or more targets is hybridized to a unique stochastic barcode. In some embodiments, the pre-spatial label comprises a molecular label. In some embodiments, the molecular label is different for each of the one or more stochastic barcodes. In some embodiments, the sample is physically divided during the contacting. In some embodiments, the sample is intact during the contacting. In some embodiments, the contacting is performed on the surface of the sample. In some embodiments, the contacting is performed inside the sample. In some embodiments, the contacting is performed subcellularly in the sample. In some embodiments, the spatial location is subcellular. In some embodiments, the contacting is performed on a substrate. In some embodiments, the substrate comprises the one or more stochastic barcodes in a known order. In some embodiments, the substrate comprises the one or more stochastic barcodes in an unknown order. In some embodiments, the method further comprises decoding the substrate. In some embodiments, the spatial label comprises from 5-20 nucleotides. 
In some embodiments, the method further comprises estimating the number of the distinct targets using the stochastic barcodes. In some embodiments, the estimating comprises generating a target-barcode molecule. In some embodiments, the target-barcode molecule comprises the sequence of a stochastic barcode to which it is associated. In some embodiments, the estimating further comprises determining the sequence of the spatial label and a molecular label. In some embodiments, the method further comprises counting occurrences of distinct sequences of the molecular label. In some embodiments, the counting is used to estimate the number of one or more targets. In some embodiments, the determining comprises sequencing the stochastic barcodes. In some embodiments, the sequencing comprises sequencing with read lengths of at least 100 bases. In some embodiments, the sequencing comprises sequencing with read lengths of at least 500 bases. In some embodiments, the identifying comprises correlating the spatial label with the spatial location in the sample. In some embodiments, the method further comprises visualizing the number of the one or more targets at the spatial location. In some embodiments, the visualizing comprises mapping the number of the one or more targets onto a map of the sample. In some embodiments, the visualizing comprises imaging the sample at a time point selected from the group consisting of: imaging the sample prior to the contacting, imaging the sample after the contacting, imaging the sample before lysing the sample, and imaging the sample after lysing the sample. In some embodiments, the imaging produces an image that is used to construct a map of a physical representation of the sample. In some embodiments, the map is two dimensional. In some embodiments, the map is three dimensional. In some embodiments, the sample comprises a single cell. In some embodiments, the sample comprises a plurality of cells. 
In some embodiments, the plurality of cells comprises one or more different cell types. In some embodiments, the one or more cell types are selected from the group consisting of: brain cells, heart cells, cancer cells, circulating tumor cells, organ cells, epithelial cells, metastatic cells, benign cells, primary cells, and circulatory cells, or any combination thereof. In some embodiments, the sample comprises a solid tissue. In some embodiments, the sample is obtained from a subject. In some embodiments, the subject is a subject selected from the group consisting of: a human, a mammal, a dog, a rat, a mouse, a fish, a fly, a worm, a plant, a fungus, a bacterium, a virus, a vertebrate, and an invertebrate. In some embodiments, the one or more targets are ribonucleic acid molecules. In some embodiments, the ribonucleic acid molecules are selected from the group consisting of: mRNA, microRNA, mRNA degradation products, and ribonucleic acids comprising a poly(A) tail, or any combination thereof. In some embodiments, the targets are deoxyribonucleic acid molecules. In some embodiments, the contacting is performed with a solid support. In some embodiments, the solid support can comprise a plurality of stochastic barcodes. In some embodiments, each stochastic barcode of the plurality of stochastic barcodes comprises a spatial label. In some embodiments, spatial labels on different solid supports differ by at least one nucleotide. In some embodiments, the stochastic barcode further comprises a universal label, and a cellular label. In some embodiments, the universal label and the cellular label are the same for all stochastic barcodes on the solid support. In some embodiments, the solid support comprises stochastic barcodes in two dimensions. In some embodiments, the solid support comprises stochastic barcodes in three dimensions. In some embodiments, the solid support comprises a bead. 
In some embodiments, the bead is selected from the group consisting of: silica gel bead, controlled pore glass bead, magnetic bead, Dynabeads, Sephadex/Sepharose beads, cellulose beads, and polystyrene beads, or any combination thereof. In some embodiments, the bead comprises a magnetic bead. In some embodiments, the solid support is semi-solid. In some embodiments, the solid support comprises a polymer, a matrix, or a hydrogel. In some embodiments, the solid support comprises a needle array device. In some embodiments, the solid support comprises an antibody.
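
By way of a non-limiting illustration, correlating a length of the spatial label with a spatial location can be sketched as follows. The sketch assumes, purely for illustration, that every spatial label block has the same fixed length, so that the number of concatenated blocks can be recovered from the total length; the disclosure does not require this. The function names and the lookup table `blocks_to_location` are hypothetical.

```python
def count_blocks(spatial_label_length, pre_spatial_length, block_length):
    """Number of spatial label blocks concatenated onto the pre-spatial label.

    Assumes fixed-length blocks (an illustrative assumption only).
    """
    extra = spatial_label_length - pre_spatial_length
    if extra < 0 or extra % block_length != 0:
        raise ValueError("length is not the pre-spatial label plus whole blocks")
    return extra // block_length


def location_from_length(spatial_label_length, pre_spatial_length,
                         block_length, blocks_to_location):
    """Look up the spatial location associated with a spatial label length.

    blocks_to_location: hypothetical dict of block count -> location name.
    """
    n = count_blocks(spatial_label_length, pre_spatial_length, block_length)
    return blocks_to_location[n]
```

Under these assumptions, spatial labels at distinct spatial locations have different lengths because each location receives a different number of blocks.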

In one aspect the disclosure provides for a method for identifying distinct cells in a population of cells comprising: contacting two or more samples to a substrate, wherein the substrate comprises one or more types of stochastic barcodes, wherein each type of the types of stochastic barcodes comprises a different spatial label, and wherein each stochastic barcode comprises a molecular label; estimating the number of one or more targets in the plurality of samples using the molecular label; and distinguishing a sample from the two or more samples by the spatial labels, wherein targets associated with different spatial labels originate from different samples. In some embodiments, the contacting comprises hybridizing the stochastic barcode with the one or more targets. In some embodiments, the hybridizing comprises hybridizing the one or more targets such that each of the one or more targets is hybridized to a unique stochastic barcode. In some embodiments, molecular labels of the stochastic barcodes are different. In some embodiments, the two or more samples are physically divided from each other during the contacting. In some embodiments, the two or more samples can be intact during the contacting. In some embodiments, the contacting is performed on the surface of the two or more samples. In some embodiments, the contacting is performed inside the two or more samples. In some embodiments, the contacting is performed subcellularly in the two or more samples. In some embodiments, the spatial location is subcellular. In some embodiments, the substrate comprises the one or more stochastic barcodes in a known order. In some embodiments, the substrate comprises the one or more stochastic barcodes in an unknown order. In some embodiments, the method further comprises decoding the substrate. In some embodiments, the spatial label comprises from 5-20 nucleotides. In some embodiments, the estimating comprises generating a target-barcode molecule. 
In some embodiments, the target-barcode molecule comprises the sequence of a stochastic barcode to which it is associated. In some embodiments, the estimating further comprises determining the sequence of the spatial label and the molecular label. In some embodiments, the method further comprises counting occurrences of distinct sequences of the molecular label. In some embodiments, the counting is used to estimate the number of one or more targets. In some embodiments, the determining comprises sequencing the stochastic barcodes. In some embodiments, the sequencing comprises sequencing with read lengths of at least 100 bases. In some embodiments, the sequencing comprises sequencing with read lengths of at least 500 bases. In some embodiments, the method further comprises visualizing the number of the one or more targets at the spatial location. In some embodiments, the visualizing comprises mapping the number of the one or more targets onto a map of the sample. In some embodiments, the visualizing comprises imaging the sample at a time point selected from the group consisting of: imaging the sample prior to the contacting, imaging the sample after the contacting, imaging the sample before lysing the sample, and imaging the sample after lysing the sample. In some embodiments, the imaging produces an image that is used to construct a map of a physical representation of the sample. In some embodiments, the map is two dimensional. In some embodiments, the map is three dimensional. In some embodiments, the sample comprises a single cell. In some embodiments, the sample comprises a plurality of cells. In some embodiments, the plurality of cells comprises one or more different cell types. In some embodiments, the one or more cell types are selected from the group consisting of: brain cells, heart cells, cancer cells, circulating tumor cells, organ cells, epithelial cells, metastatic cells, benign cells, primary cells, and circulatory cells, or any combination thereof. 
In some embodiments, the sample comprises a solid tissue. In some embodiments, the sample is obtained from a subject. In some embodiments, the subject is a subject selected from the group consisting of: a human, a mammal, a dog, a rat, a mouse, a fish, a fly, a worm, a plant, a fungus, a bacterium, a virus, a vertebrate, and an invertebrate. In some embodiments, the one or more targets are ribonucleic acid molecules. In some embodiments, the ribonucleic acid molecules are selected from the group consisting of: mRNA, microRNA, mRNA degradation products, and ribonucleic acids comprising a poly(A) tail, or any combination thereof. In some embodiments, the targets are deoxyribonucleic acid molecules. In some embodiments, the contacting is performed with a solid support. In some embodiments, the solid support comprises a plurality of stochastic barcodes. In some embodiments, each stochastic barcode of the plurality of stochastic barcodes comprises a spatial label. In some embodiments, spatial labels on different solid supports differ by at least one nucleotide. In some embodiments, the stochastic barcode further comprises a universal label, and a cellular label. In some embodiments, the universal label and the cellular label are the same for all stochastic barcodes on the solid support. In some embodiments, the solid support comprises stochastic barcodes in two dimensions. In some embodiments, the solid support comprises stochastic barcodes in three dimensions. In some embodiments, the solid support comprises a bead. In some embodiments, the bead is selected from the group consisting of: silica gel bead, controlled pore glass bead, magnetic bead, Dynabeads, Sephadex/Sepharose beads, cellulose beads, and polystyrene beads, or any combination thereof. In some embodiments, the bead comprises a magnetic bead. In some embodiments, the solid support is semi-solid. In some embodiments, the solid support comprises a polymer, a matrix, or a hydrogel. 
In some embodiments, the solid support comprises a needle array device. In some embodiments, the solid support comprises an antibody.
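
By way of a non-limiting illustration, distinguishing samples by their spatial labels can be sketched as follows. The sketch assumes that each sequenced target-barcode molecule has already been parsed into a spatial label, a molecular label, and a target name, and that a lookup from spatial label to sample is available. The function name `split_by_sample` and the lookup table are hypothetical.

```python
from collections import defaultdict


def split_by_sample(reads, spatial_label_to_sample):
    """Group reads by the sample that their spatial label identifies.

    reads: iterable of (spatial_label, molecular_label, target) tuples.
    spatial_label_to_sample: hypothetical dict of spatial label -> sample name.
    """
    by_sample = defaultdict(list)
    for spatial_label, molecular_label, target in reads:
        sample = spatial_label_to_sample.get(spatial_label)
        if sample is None:
            # Skip reads whose spatial label is not recognized.
            continue
        by_sample[sample].append((molecular_label, target))
    return dict(by_sample)
```

In this sketch, targets associated with different spatial labels are placed in different groups, and the molecular labels within each group remain available for estimating the number of each target.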

In one aspect the disclosure provides for a kit comprising: one or more types of stochastic barcodes, wherein each stochastic barcode of the one or more types of stochastic barcodes comprises a spatial label, wherein spatial labels of the one or more types of stochastic barcodes differ by at least one nucleotide; and instructions for use. In some embodiments, the one or more types of stochastic barcodes are attached to a solid support. In some embodiments, the one or more types of stochastic barcodes are attached to a substrate. In some embodiments, the kit further comprises a buffer. In some embodiments, the kit further comprises a cartridge. In some embodiments, the one or more supports are pre-loaded on a substrate. In some embodiments, the kit further comprises reagents for a reverse transcription reaction. In some embodiments, the kit further comprises reagents for an amplification reaction.

In one aspect, the disclosure provides for a method comprising: imaging a sample contacted to a substrate comprising a plurality of probes, thereby producing an image; lysing the sample thereby releasing nucleic acids from the sample; analyzing the nucleic acids from the sample at locations on the substrate; and correlating locations on the image with data from the analyzing to identify a spatial location of a nucleic acid in a sample. In some embodiments, the imaging comprises staining the sample. In some embodiments, the staining comprises staining with a stain selected from the group consisting of: a fluorescent stain, a negative stain, and an antibody stain, or any combination thereof. In some embodiments, the imaging uses a technique selected from the group consisting of: optical microscopy, electron microscopy, confocal microscopy, and fluorescence microscopy. In some embodiments, the performing immunohistological analysis produces an image. In some embodiments, the sample comprises a cell monolayer. In some embodiments, the sample comprises fixed cells. In some embodiments, the sample comprises a tissue section. In some embodiments, the lysing is performed by heating the sample, contacting the sample with a detergent, or changing the pH of the sample, or any combination thereof. In some embodiments, the analyzing comprises hybridizing the nucleic acids to the oligo(dT)s. In some embodiments, the nucleic acids comprise polyadenylated nucleic acids. In some embodiments, the method further comprises homopolymer tailing the nucleic acids. In some embodiments, the method further comprises amplifying the nucleic acids. In some embodiments, the amplifying comprises bridge amplification. In some embodiments, the amplifying comprises amplifying with a gene-specific primer. In some embodiments, the amplifying comprises amplifying with a universal primer. In some embodiments, the amplifying comprises amplifying with an oligo(dT) primer. 
In some embodiments, the method further comprises detecting the nucleic acids. In some embodiments, the detecting comprises hybridizing one or more probes to the nucleic acids. In some embodiments, the one or more probes comprise a fluorescent label. In some embodiments, the one or more probes can be 4 probes. In some embodiments, the analyzing comprises hybridizing the nucleic acids to a microarray. In some embodiments, the correlating comprises overlaying the image with the data. In some embodiments, the correlating comprises mapping the x-y location of a feature on the substrate onto the image. In some embodiments, the probes comprise oligo(dT). In some embodiments, the probes comprise gene-specific probes. In some embodiments, the probes comprise a combination of oligo(dT) probes and gene-specific probes. In some embodiments, the gene-specific probes are gene-specific for at least 2 genes.
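
By way of a non-limiting illustration, mapping the x-y location of a feature on the substrate onto the image can be sketched as follows. The sketch assumes, purely for illustration, that the features lie on a regular grid aligned with the image axes, so that a scale and an offset suffice; rotation and distortion are ignored. The function name and all parameters (`pitch_um`, `um_per_pixel`, `origin_px`) are hypothetical.

```python
def feature_to_pixel(col, row, pitch_um, um_per_pixel, origin_px=(0.0, 0.0)):
    """Convert a substrate feature's grid index to image pixel coordinates.

    col, row: feature index on the substrate grid.
    pitch_um: assumed center-to-center feature spacing, in micrometers.
    um_per_pixel: assumed image scale, in micrometers per pixel.
    origin_px: pixel coordinates of feature (0, 0) in the image.
    """
    x_px = origin_px[0] + col * pitch_um / um_per_pixel
    y_px = origin_px[1] + row * pitch_um / um_per_pixel
    return (x_px, y_px)
```

Data from the analyzing at a given feature can then be overlaid on the image at the returned pixel coordinates.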

In one aspect the disclosure provides for a method for diagnosing a subject comprising: imaging a sample from the subject contacted to a substrate comprising a plurality of probes, thereby producing an image; lysing the sample thereby releasing nucleic acids from the sample; analyzing the nucleic acids from the sample at locations on the substrate; and diagnosing the subject based on the image and data from the analyzing. In some embodiments, the subject is a human. In some embodiments, the subject is a mouse, a dog, a rat, or a vertebrate. In some embodiments, the diagnosing comprises identifying different cell types of the sample. In some embodiments, the diagnosing comprises determining if different cell types respond to a therapy. In some embodiments, the diagnosing comprises determining a genotype of one or more cells in the sample. In some embodiments, the method further comprises treating the subject. In some embodiments, the treating comprises administering a drug to the subject. In some embodiments, the drug is chosen based on predicted responsiveness to the identified cell types of the sample. In some embodiments, the imaging comprises staining the sample. In some embodiments, the staining comprises staining with a stain selected from the group consisting of: a fluorescent stain, a negative stain, and an antibody stain, or any combination thereof. In some embodiments, the imaging uses a technique selected from the group consisting of: optical microscopy, electron microscopy, confocal microscopy, and fluorescence microscopy. In some embodiments, the performing immunohistological analysis produces an image. In some embodiments, the sample comprises a cell monolayer. In some embodiments, the sample comprises fixed cells. In some embodiments, the sample comprises a tissue section. In some embodiments, the lysing is performed by heating the sample, contacting the sample with a detergent, or changing the pH of the sample, or any combination thereof. 
In some embodiments, the analyzing comprises hybridizing the nucleic acids to the oligo(dT)s. In some embodiments, the nucleic acids comprise polyadenylated nucleic acids. In some embodiments, the method further comprises homopolymer tailing the nucleic acids. In some embodiments, the method further comprises amplifying the nucleic acids. In some embodiments, the amplifying comprises bridge amplification. In some embodiments, the amplifying comprises amplifying with a gene-specific primer. In some embodiments, the amplifying comprises amplifying with a universal primer. In some embodiments, the amplifying comprises amplifying with an oligo(dT) primer. In some embodiments, the method further comprises detecting the nucleic acids. In some embodiments, the detecting comprises hybridizing one or more probes to the nucleic acids. In some embodiments, the one or more probes comprise a fluorescent label. In some embodiments, the one or more probes can be 4 probes. In some embodiments, the analyzing comprises hybridizing the nucleic acids to a microarray. In some embodiments, the probes comprise oligo(dT). In some embodiments, the probes comprise gene-specific probes. In some embodiments, the probes comprise a combination of oligo(dT) probes and gene-specific probes. In some embodiments, the gene-specific probes are gene-specific for at least 2 genes.

In one aspect, the disclosure provides for a method comprising: imaging a sample contacted to a first substrate comprising a plurality of probes, thereby producing an image; lysing the sample thereby releasing nucleic acids from the sample to hybridize to the plurality of probes; analyzing the nucleic acids from the sample at locations on the substrate; and replicating the first substrate thereby making a replicate substrate. In some embodiments, the probes of the first substrate comprise oligo(dT). In some embodiments, the probes of the first substrate comprise gene-specific primers. In some embodiments, probes of the replicate substrate comprise gene-specific primers for another location on the same gene as the gene-specific primers on the first substrate. In some embodiments, the replicating comprises contacting the first substrate with a replicate substrate. In some embodiments, the replicating comprises hybridizing nucleic acids from the first substrate to the replicate substrate. In some embodiments, the imaging comprises staining the sample. In some embodiments, the staining comprises staining with a stain selected from the group consisting of: a fluorescent stain, a negative stain, and an antibody stain, or any combination thereof. In some embodiments, the imaging uses a technique selected from the group consisting of: optical microscopy, electron microscopy, confocal microscopy, and fluorescence microscopy. In some embodiments, the performing immunohistological analysis produces an image. In some embodiments, the sample comprises a cell monolayer. In some embodiments, the sample comprises fixed cells. In some embodiments, the sample comprises a tissue section. In some embodiments, the lysing is performed by heating the sample, contacting the sample with a detergent, or changing the pH of the sample, or any combination thereof. In some embodiments, the analyzing comprises hybridizing the nucleic acids to the oligo(dT)s. 
In some embodiments, the method further comprises homopolymer tailing the nucleic acids. In some embodiments, the method further comprises amplifying the nucleic acids to generate amplicons. In some embodiments, the amplifying comprises bridge amplification. In some embodiments, the amplifying comprises amplifying with a gene-specific primer. In some embodiments, the amplifying comprises amplifying with a universal primer. In some embodiments, the amplifying comprises amplifying with an oligo(dT) primer. In some embodiments, the replicating comprises hybridizing the amplicons onto the replicate substrate.

Definitions

Unless otherwise defined, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art in the field to which this disclosure belongs. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise. Any reference to “or” herein is intended to encompass “and/or” unless otherwise stated.

As used herein, the term “adaptor” can mean a sequence to facilitate amplification or sequencing of associated nucleic acids. The associated nucleic acids can comprise target nucleic acids. The associated nucleic acids can comprise one or more of spatial labels, target labels, sample labels, indexing labels, barcodes, stochastic barcodes, or molecular labels. The adaptors can be linear. The adaptors can be pre-adenylated adaptors. The adaptors can be double- or single-stranded. One or more adaptors can be located on the 5′ or 3′ end of a nucleic acid. When the adaptors comprise known sequences on the 5′ and 3′ ends, the known sequences can be the same or different sequences. An adaptor located on the 5′ and/or 3′ ends of a polynucleotide can be capable of hybridizing to one or more oligonucleotides immobilized on a surface. An adaptor can, in some embodiments, comprise a universal sequence. A universal sequence can be a region of nucleotide sequence that is common to two or more nucleic acid molecules. The two or more nucleic acid molecules can also have regions of different sequence. Thus, for example, the 5′ adaptors can comprise identical and/or universal nucleic acid sequences and the 3′ adaptors can comprise identical and/or universal sequences. A universal sequence that may be present in different members of a plurality of nucleic acid molecules can allow the replication or amplification of multiple different sequences using a single universal primer that is complementary to the universal sequence. Similarly, at least one, two (e.g., a pair) or more universal sequences that may be present in different members of a collection of nucleic acid molecules can allow the replication or amplification of multiple different sequences using at least one, two (e.g., a pair) or more single universal primers that are complementary to the universal sequences. Thus, a universal primer includes a sequence that can hybridize to such a universal sequence. 
The target nucleic acid sequence-bearing molecules may be modified to attach universal adaptors (e.g., non-target nucleic acid sequences) to one or both ends of the different target nucleic acid sequences. The one or more universal primers attached to the target nucleic acid can provide sites for hybridization of universal primers. The one or more universal primers attached to the target nucleic acid can be the same or different from each other.

As used herein, the term “associated” or “associated with” can mean that two or more species are identifiable as being co-located at a point in time. An association can mean that two or more species are or were within a similar container. An association can be an informatics association, where for example digital information regarding two or more species is stored and can be used to determine that one or more of the species were co-located at a point in time. An association can also be a physical association. In some embodiments two or more associated species are “tethered”, “attached”, or “immobilized” to one another or to a common solid or semisolid surface. An association may refer to covalent or non-covalent means for attaching labels to solid or semi-solid supports such as beads. An association may be a covalent bond between a target and a label.

As used herein, the term “complementary” can refer to the capacity for precise pairing between two nucleotides. For example, if a nucleotide at a given position of a nucleic acid is capable of hydrogen bonding with a nucleotide of another nucleic acid, then the two nucleic acids are considered to be complementary to one another at that position. Complementarity between two single-stranded nucleic acid molecules may be “partial,” in which only some of the nucleotides bind, or it may be complete when total complementarity exists between the single-stranded molecules. A first nucleotide sequence can be said to be the “complement” of a second sequence if the first nucleotide sequence is complementary to the second nucleotide sequence. A first nucleotide sequence can be said to be the “reverse complement” of a second sequence, if the first nucleotide sequence is complementary to a sequence that is the reverse (i.e., the order of the nucleotides is reversed) of the second sequence. As used herein, the terms “complement”, “complementary”, and “reverse complement” can be used interchangeably. It is understood from the disclosure that if a molecule can hybridize to another molecule it may be the complement of the molecule that is hybridizing.
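By way of illustration only, the complement, the reverse complement, and partial complementarity described above can be sketched in Python as follows. The function names are hypothetical and are not part of the disclosure; the sketch assumes unmodified DNA bases (A, C, G, T) and two strands of equal length.

```python
# Illustrative sketch only; names are hypothetical, not from the disclosure.
# Assumes plain DNA bases and Watson-Crick pairing (A-T, G-C).
_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Return the base-by-base complement, keeping the original order."""
    return "".join(_PAIR[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Return the complement with the order of the nucleotides reversed."""
    return complement(seq)[::-1]


def complementary_fraction(a: str, b: str) -> float:
    """Fraction of positions at which strand a can pair with strand b
    when b is aligned antiparallel to a (1.0 means total complementarity)."""
    if not a or len(a) != len(b):
        raise ValueError("strands must be non-empty and of equal length")
    partner = reverse_complement(b)
    matches = sum(x == y for x, y in zip(a.upper(), partner))
    return matches / len(a)
```

A value below 1.0 returned by the hypothetical `complementary_fraction` corresponds to the “partial” complementarity described above.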

As used herein, the term “digital counting” can refer to a method for estimating a number of target molecules in a sample. Digital counting can include the step of determining a number of unique labels that have been associated with targets in a sample. This stochastic methodology transforms the problem of counting molecules from one of locating and identifying identical molecules to a series of yes/no digital questions regarding detection of a set of predefined labels.
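As a minimal sketch of the counting step described above, the following Python example reduces a set of sequencing reads to the number of unique labels observed per target. The input format (pairs of target identifier and label sequence), the gene names, and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def digital_count(observations):
    """Count the distinct labels seen for each target.

    observations: iterable of (target_id, label_sequence) pairs, one per
    sequenced read (hypothetical input format). Reads that repeat the same
    label for the same target, e.g. amplification copies, are counted once.
    """
    labels_by_target = defaultdict(set)
    for target_id, label in observations:
        labels_by_target[target_id].add(label)
    return {target: len(labels) for target, labels in labels_by_target.items()}
```

Each label is either detected or not detected for a given target, which is the yes/no question referred to above.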

As used herein, the term “label” or “labels” can refer to nucleic acid codes associated with a target within a sample. A label can be, for example, a nucleic acid label. A label can be an entirely or partially amplifiable label. A label can be an entirely or partially sequenceable label. A label can be a portion of a native nucleic acid that is identifiable as distinct. A label can be a known sequence. A label can comprise a junction of nucleic acid sequences, for example a junction of a native and non-native sequence. As used herein, the term “label” can be used interchangeably with the terms “index,” “tag,” or “label-tag.” Labels can convey information. For example, in various embodiments, labels can be used to determine an identity of a sample, a source of a sample, an identity of a cell, and/or a target.

As used herein, the term “non-depleting reservoirs” can refer to a pool of stochastic barcodes made up of many different labels. A non-depleting reservoir can comprise large numbers of different stochastic barcodes such that when the non-depleting reservoir is associated with a pool of targets each target is likely to be associated with a unique stochastic barcode. The uniqueness of each labeled target molecule can be determined by the statistics of random choice, and depends on the number of copies of identical target molecules in the collection compared to the diversity of labels. The size of the resulting set of labeled target molecules can be determined by the stochastic nature of the barcoding process, and analysis of the number of stochastic barcodes detected then allows calculation of the number of target molecules present in the original collection or sample. When the ratio of the number of copies of a target molecule present to the number of unique stochastic barcodes is low, the labeled target molecules are highly unique (i.e., there is a very low probability that more than one target molecule will have been labeled with a given label).
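The statistics of random choice referred to above can be illustrated with a simple model. The sketch below assumes, for illustration only, that each of n copies of a target independently receives one of m equally likely labels; under that assumption the expected number of distinct labels observed is m(1 − (1 − 1/m)^n), and the relationship can be inverted to estimate n from an observed label count. The function names and the uniform-choice model are assumptions, not a statement of the method used in any particular embodiment.

```python
import math


def expected_distinct_labels(n_molecules: int, n_labels: int) -> float:
    """Expected number of distinct labels when each molecule draws one of
    n_labels labels uniformly at random (assumed model)."""
    return n_labels * (1.0 - (1.0 - 1.0 / n_labels) ** n_molecules)


def estimate_molecules(k_observed: float, n_labels: int) -> float:
    """Invert the expectation above: estimate the number of molecules from
    the number of distinct labels observed."""
    if n_labels < 2 or not 0 <= k_observed < n_labels:
        raise ValueError("need n_labels >= 2 and 0 <= k_observed < n_labels")
    return math.log(1.0 - k_observed / n_labels) / math.log(1.0 - 1.0 / n_labels)
```

When n is small relative to m, the expected number of distinct labels is close to n, which corresponds to the highly unique labeling described above.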

As used herein, a “nucleic acid” can generally refer to a polynucleotide sequence, or fragment thereof. A nucleic acid can comprise nucleotides. A nucleic acid can be exogenous or endogenous to a cell. A nucleic acid can exist in a cell-free environment. A nucleic acid can be a gene or fragment thereof. A nucleic acid can be DNA. A nucleic acid can be RNA. A nucleic acid can comprise one or more analogs (e.g., altered backbone, sugar, or nucleobase). Some non-limiting examples of analogs include: 5-bromouracil, peptide nucleic acid, xeno nucleic acid, morpholinos, locked nucleic acids, glycol nucleic acids, threose nucleic acids, dideoxynucleotides, cordycepin, 7-deaza-GTP, fluorophores (e.g., rhodamine or fluorescein linked to the sugar), thiol containing nucleotides, biotin linked nucleotides, fluorescent base analogs, CpG islands, methyl-7-guanosine, methylated nucleotides, inosine, thiouridine, pseudouridine, dihydrouridine, queuosine, and wyosine. “Nucleic acid”, “polynucleotide”, “target polynucleotide”, and “target nucleic acid” can be used interchangeably.

A nucleic acid can comprise one or more modifications (e.g., a base modification, a backbone modification), to provide the nucleic acid with a new or enhanced feature (e.g., improved stability). A nucleic acid can comprise a nucleic acid affinity tag. A nucleoside can be a base-sugar combination. The base portion of the nucleoside can be a heterocyclic base. The two most common classes of such heterocyclic bases are the purines and the pyrimidines. Nucleotides can be nucleosides that further include a phosphate group covalently linked to the sugar portion of the nucleoside. For those nucleosides that include a pentofuranosyl sugar, the phosphate group can be linked to the 2′, the 3′, or the 5′ hydroxyl moiety of the sugar. In forming nucleic acids, the phosphate groups can covalently link adjacent nucleosides to one another to form a linear polymeric compound. In turn, the respective ends of this linear polymeric compound can be further joined to form a circular compound; however, linear compounds are generally suitable. In addition, linear compounds may have internal nucleotide base complementarity and may therefore fold in a manner as to produce a fully or partially double-stranded compound. Within nucleic acids, the phosphate groups can commonly be referred to as forming the internucleoside backbone of the nucleic acid. The linkage or backbone can be a 3′ to 5′ phosphodiester linkage.

A nucleic acid can comprise a modified backbone and/or modified internucleoside linkages. Modified backbones can include those that retain a phosphorus atom in the backbone and those that do not have a phosphorus atom in the backbone. Suitable modified nucleic acid backbones containing a phosphorus atom therein can include, for example, phosphorothioates, chiral phosphorothioates, phosphorodithioates, phosphotriesters, aminoalkyl phosphotriesters, methyl and other alkyl phosphonates such as 3′-alkylene phosphonates, 5′-alkylene phosphonates, chiral phosphonates, phosphinates, phosphoramidates including 3′-amino phosphoramidate and aminoalkyl phosphoramidates, phosphorodiamidates, thionophosphoramidates, thionoalkylphosphonates, thionoalkylphosphotriesters, selenophosphates, and boranophosphates having normal 3′-5′ linkages, 2′-5′ linked analogs, and those having inverted polarity wherein one or more internucleotide linkages is a 3′ to 3′, a 5′ to 5′ or a 2′ to 2′ linkage.

A nucleic acid can comprise polynucleotide backbones that are formed by short chain alkyl or cycloalkyl internucleoside linkages, mixed heteroatom and alkyl or cycloalkyl internucleoside linkages, or one or more short chain heteroatomic or heterocyclic internucleoside linkages. These can include those having morpholino linkages (formed in part from the sugar portion of a nucleoside); siloxane backbones; sulfide, sulfoxide and sulfone backbones; formacetyl and thioformacetyl backbones; methylene formacetyl and thioformacetyl backbones; riboacetyl backbones; alkene containing backbones; sulfamate backbones; methyleneimino and methylenehydrazino backbones; sulfonate and sulfonamide backbones; amide backbones; and others having mixed N, O, S and CH2 component parts.

A nucleic acid can comprise a nucleic acid mimetic. The term “mimetic” can be intended to include polynucleotides wherein only the furanose ring or both the furanose ring and the internucleotide linkage are replaced with non-furanose groups; replacement of only the furanose ring can also be referred to as being a sugar surrogate. The heterocyclic base moiety or a modified heterocyclic base moiety can be maintained for hybridization with an appropriate target nucleic acid. One such nucleic acid can be a peptide nucleic acid (PNA). In a PNA, the sugar-backbone of a polynucleotide can be replaced with an amide containing backbone, in particular an aminoethylglycine backbone. The nucleotides can be retained and are bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone. The backbone in PNA compounds can comprise two or more linked aminoethylglycine units, which gives PNA an amide containing backbone. The heterocyclic base moieties can be bound directly or indirectly to aza nitrogen atoms of the amide portion of the backbone.

A nucleic acid can comprise a morpholino backbone structure. For example, a nucleic acid can comprise a 6-membered morpholino ring in place of a ribose ring. In some of these embodiments, a phosphorodiamidate or other non-phosphodiester internucleoside linkage can replace a phosphodiester linkage.

A nucleic acid can comprise linked morpholino units (i.e., morpholino nucleic acid) having heterocyclic bases attached to the morpholino ring. Linking groups can link the morpholino monomeric units in a morpholino nucleic acid. Non-ionic morpholino-based oligomeric compounds can have fewer undesired interactions with cellular proteins. Morpholino-based polynucleotides can be nonionic mimics of nucleic acids. A variety of compounds within the morpholino class can be joined using different linking groups. A further class of polynucleotide mimetic can be referred to as cyclohexenyl nucleic acids (CeNA). The furanose ring normally present in a nucleic acid molecule can be replaced with a cyclohexenyl ring. CeNA DMT protected phosphoramidite monomers can be prepared and used for oligomeric compound synthesis using phosphoramidite chemistry. The incorporation of CeNA monomers into a nucleic acid chain can increase the stability of a DNA/RNA hybrid. CeNA oligoadenylates can form complexes with nucleic acid complements with similar stability to the native complexes. A further modification can include Locked Nucleic Acids (LNAs) in which the 2′-hydroxyl group is linked to the 4′ carbon atom of the sugar ring, thereby forming a 2′-C,4′-C-oxymethylene linkage and thus a bicyclic sugar moiety. The linkage can be a methylene (—CH2—)n group bridging the 2′ oxygen atom and the 4′ carbon atom, wherein n is 1 or 2. LNA and LNA analogs can display very high duplex thermal stabilities with complementary nucleic acid (Tm=+3 to +10° C.), stability towards 3′-exonucleolytic degradation and good solubility properties.

A nucleic acid may also include nucleobase (often referred to simply as “base”) modifications or substitutions. As used herein, “unmodified” or “natural” nucleobases can include the purine bases (e.g., adenine (A) and guanine (G)), and the pyrimidine bases (e.g., thymine (T), cytosine (C) and uracil (U)). Modified nucleobases can include other synthetic and natural nucleobases such as 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, xanthine, hypoxanthine, 2-aminoadenine, 6-methyl and other alkyl derivatives of adenine and guanine, 2-propyl and other alkyl derivatives of adenine and guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 5-halouracil and cytosine, 5-propynyl (—C≡C—CH3) uracil and cytosine and other alkynyl derivatives of pyrimidine bases, 6-azo uracil, cytosine and thymine, 5-uracil (pseudouracil), 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hydroxyl and other 8-substituted adenines and guanines, 5-halo particularly 5-bromo, 5-trifluoromethyl and other 5-substituted uracils and cytosines, 7-methylguanine and 7-methyladenine, 2-F-adenine, 2-aminoadenine, 8-azaguanine and 8-azaadenine, 7-deazaguanine and 7-deazaadenine and 3-deazaguanine and 3-deazaadenine. Modified nucleobases can include tricyclic pyrimidines such as phenoxazine cytidine (1H-pyrimido(5,4-b)(1,4)benzoxazin-2(3H)-one), phenothiazine cytidine (1H-pyrimido(5,4-b)(1,4)benzothiazin-2(3H)-one), G-clamps such as a substituted phenoxazine cytidine (e.g., 9-(2-aminoethoxy)-H-pyrimido(5,4-(b) (1,4)benzoxazin-2(3H)-one), carbazole cytidine (2H-pyrimido(4,5-b)indol-2-one), and pyridoindole cytidine (H-pyrido(3′,′:4,5)pyrrolo[2,3-d]pyrimidin-2-one).

As used herein, the term “sample” can refer to a composition comprising targets. Suitable samples for analysis by the disclosed methods, devices, and systems include cells, tissues, organs, or organisms.

As used herein, the term “sampling device” or “device” can refer to a device which may take a section of a sample and/or place the section on a substrate. A sampling device can refer to, for example, a fluorescence-activated cell sorting (FACS) machine, a cell sorter machine, a biopsy needle, a biopsy device, a tissue sectioning device, a microfluidic device, a blade grid, and/or a microtome.

As used herein, the term “solid support” can refer to discrete solid or semi-solid surfaces to which a plurality of stochastic barcodes may be attached. A solid support may encompass any type of solid, porous, or hollow sphere, ball, bearing, cylinder, or other similar configuration composed of plastic, ceramic, metal, or polymeric material (e.g., hydrogel) onto which a nucleic acid may be immobilized (e.g., covalently or non-covalently). A solid support may comprise a discrete particle that may be spherical (e.g., microspheres) or have a non-spherical or irregular shape, such as cubic, cuboid, pyramidal, cylindrical, conical, oblong, or disc-shaped, and the like. The term “solid support” may be used interchangeably with the term “bead.” A solid support can refer to a “substrate.” A substrate can be a type of solid support. A substrate can refer to a continuous solid or semi-solid surface on which the methods of the disclosure may be performed. A substrate can refer to an array, a cartridge, a chip, a device, or a slide, for example. As used herein, “solid support” and “substrate” are sometimes used interchangeably.

As used herein, the term “spatial label” can refer to a label which can be associated with a position in space.

As used herein, the term “stochastic barcode” can refer to a polynucleotide sequence comprising labels. A stochastic barcode can be a polynucleotide sequence that can be used for stochastic labeling. Stochastic barcodes can be used to quantify targets within a sample. Stochastic barcodes can be used to control for errors which may occur after a label is associated with a target. For example, a stochastic barcode can be used to assess amplification or sequencing errors. A stochastic barcode associated with a target can be called a stochastic barcode-target or stochastic barcode-tag-target.

As used herein, the term “stochastic barcoding” can refer to the random labeling (e.g., barcoding) of nucleic acids. Stochastic barcoding can utilize a recursive Poisson strategy to associate and quantify labels associated with targets.

As used herein, the term “target” can refer to a composition which can be associated with a stochastic barcode. Exemplary suitable targets for analysis by the disclosed methods, devices, and systems include oligonucleotides, DNA, RNA, mRNA, microRNA, tRNA, and the like. Targets can be single or double stranded. In some embodiments, targets can be proteins. In some embodiments, targets are lipids.

As used herein, the term “reverse transcriptases” can refer to a group of enzymes having reverse transcriptase activity (i.e., that catalyze synthesis of DNA from an RNA template). In general, such enzymes include, but are not limited to, retroviral reverse transcriptase, retrotransposon reverse transcriptase, retroplasmid reverse transcriptases, retron reverse transcriptases, bacterial reverse transcriptases, group II intron-derived reverse transcriptase, and mutants, variants or derivatives thereof. Non-retroviral reverse transcriptases include non-LTR retrotransposon reverse transcriptases, retroplasmid reverse transcriptases, retron reverse transcriptases, and group II intron reverse transcriptases. Examples of group II intron reverse transcriptases include the Lactococcus lactis Ll.LtrB intron reverse transcriptase, the Thermosynechococcus elongatus TeI4c intron reverse transcriptase, or the Geobacillus stearothermophilus GsI-IIC intron reverse transcriptase. Other classes of reverse transcriptases can include many classes of non-retroviral reverse transcriptases (e.g., retrons, group II introns, and diversity-generating retroelements among others).

As used herein, the term “template switching” can refer to the ability of a reverse transcriptase to switch from an initial nucleic acid sequence template to the 3′ end of a new nucleic acid sequence template having little or no complementarity to the 3′ end of the nucleic acid synthesized from the initial template. Nucleic acid copies of a target polynucleotide can be made using template switching. Template switching allows, e.g., a DNA copy to be prepared using a reverse transcriptase that switches from an initial nucleic acid sequence template to the 3′ end of a new nucleic acid sequence template having little or no complementarity to the 3′ end of the DNA synthesized from the initial template, thereby allowing the synthesis of a continuous product DNA that directly links an adaptor sequence to a target oligonucleotide sequence without ligation. Template switching can comprise ligation of an adaptor, homopolymer tailing (e.g., polyadenylation), a random primer, or an oligonucleotide that the polymerase can associate with.

Stochastic Barcodes with Spatial Labels and Dimension Labels

Disclosed herein are methods, compositions, devices, systems, and kits for spatial stochastic barcoding. Some embodiments disclosed herein provide methods for determining the number and spatial locations of a plurality of targets in a sample. The methods include, in some embodiments, stochastically barcoding the plurality of targets in the sample using a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes includes a spatial label and a molecular label; estimating the number of each of the plurality of targets using the molecular label; and identifying the spatial location of each of the plurality of targets using the spatial label. In some embodiments, the method can be multiplexed. The sample can comprise a plurality of cells and the plurality of targets can be associated with the plurality of cells.

Disclosed herein are methods for determining spatial locations of a plurality of targets in a sample. In some embodiments, the methods include: stochastically barcoding the plurality of targets in the sample at one or more time points using a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a spatial label; and identifying the spatial location of each of the plurality of targets using the spatial label. Stochastically barcoding the plurality of targets in the sample using the plurality of stochastic barcodes can include stochastically barcoding the plurality of targets in the sample at different time points using the plurality of stochastic barcodes. Each of the plurality of stochastic barcodes can include a dimension label, and the dimension labels of the plurality of stochastic barcodes used for stochastic barcoding the plurality of targets at the different time points can be different. The dimension labels can correlate with the different time points.

Spatial stochastic barcoding can refer to the stochastic barcoding of a plurality of target molecules in single cells to determine spatial orientation of the target molecules. As shown in FIG. 1, the disclosure provides for a method for correlating information in real physical space with information in chemical space. A two-dimensional or three-dimensional sample (e.g., a cell) 105 can be divided into multiple sections, for example 110/111/112/113. In some embodiments, sections 110/111/112/113 can be physically divided, then chemically divided based on the physical division. In some embodiments, sections 110/111/112/113 can be chemically divided without physical division. In some embodiments, the sections 110/111/112/113 can be physically separated 115 from the sample 105. Each section 110/111/112/113 can be placed in a separate container 120 on a substrate 125. The sections 110/111/112/113 on the substrate 125 can be subjected to stochastic barcoding. Stochastic barcoding can comprise labeling distinct targets in each section 110/111/112/113 with a different barcode. In some embodiments, the different barcode comprises a spatial label. The sections can be stochastically labeled, amplified, and/or digitally counted, wherein the number of distinct targets can be estimated from the digital counting of different barcodes. The information in the spatial label of the different barcode can correspond to a location on the sample 105. In this way, the method can be used to determine the number of distinct targets in a sample 105 at distinct physical locations.
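For illustration, the correlation between chemical space (barcode sequences) and physical space (sections of the sample) described above can be sketched as follows. The read format, the label sequences, and the lookup table from spatial label to section are hypothetical; the section numbers merely echo the reference numerals of FIG. 1.

```python
from collections import defaultdict


def count_by_location(reads, label_to_section):
    """Estimate the number of each target in each section of the sample.

    reads: iterable of (spatial_label, molecular_label, target) tuples
           (hypothetical format, one tuple per sequenced read).
    label_to_section: dict mapping a spatial label sequence to the section
           of the sample in which that label was used.
    Returns {(section, target): number of distinct molecular labels}.
    """
    seen = defaultdict(set)
    for spatial_label, molecular_label, target in reads:
        section = label_to_section[spatial_label]
        seen[(section, target)].add(molecular_label)
    return {key: len(labels) for key, labels in seen.items()}
```

In this sketch the spatial label selects the section and the molecular label is used only for counting, so that repeated reads of the same labeled molecule are counted once.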

The methods, devices, and systems disclosed herein may be used for a variety of applications in basic research, biomedical research, environmental testing, and clinical diagnostics. Examples of applications for the disclosed methods, devices, and systems include, but are not limited to, genotyping, gene expression profiling, detection and identification of rare cells, diagnosis of a disease or condition, determining prognosis for a disease or condition, determining a course of treatment for a disease or condition, monitoring the response to treatment for a disease or condition, and understanding biological development processes. For example, the methods of the disclosure can be used for whole transcriptome analysis, rare cell (e.g., circulating tumor cell) analysis, chimeric antigen receptor T-cell (CAR-T) therapy analysis (e.g., determining specific cells that respond to CAR-T therapy versus non-responders), and neuroscience (e.g., therapies and diagnostics for autism, schizophrenia, bipolar disorder, Parkinson's disease, and Alzheimer's disease). In some embodiments, the methods can include treating the subject. Treating the subject can include administering a drug to the subject.

A stochastic barcode can refer to a polynucleotide sequence that may be used to stochastically label (e.g., barcode, tag) a target. A stochastic barcode can comprise one or more labels. Exemplary labels can include a universal label, a cellular label, a molecular label, a sample label, a plate label, a spatial label, and/or a pre-spatial label. FIG. 2 illustrates an exemplary stochastic barcode with a spatial label of the disclosure. A stochastic barcode 204 can comprise a 5′ amine that may link the stochastic barcode to a solid support 205. The stochastic barcode can comprise a universal label, a dimension label, a spatial label, a cellular label, and/or a molecular label. The universal label may be the 5′-most label. The molecular label may be the 3′-most label. The spatial label, dimension label, and the cellular label may be in any order. In some embodiments, the universal label, the spatial label, the dimension label, the cellular label, and the molecular label are in any order. The stochastic barcode can comprise a target-binding region. The target-binding region can interact with a target in a sample. The target can be, or comprise, ribonucleic acids (RNAs), messenger RNAs (mRNAs), microRNAs, small interfering RNAs (siRNAs), RNA degradation products, RNAs each comprising a poly(A) tail, or any combination thereof. In some embodiments, the plurality of targets can include deoxyribonucleic acids (DNAs).

For example, a target-binding region can comprise an oligo(dT) sequence which can interact with poly(A) tails of mRNAs. In some embodiments, the labels of the stochastic barcode (e.g., universal label, dimension label, spatial label, cellular label, and molecular label) may be separated by 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more nucleotides.
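One way to picture the arrangement described above is as a sequence read 5′ to 3′ and cut into consecutive fields. The sketch below assumes, purely for illustration, one particular label order, fixed label lengths, and no spacer nucleotides between labels; none of these values is taken from the disclosure, and the names are hypothetical.

```python
# Hypothetical layout: (label name, length in nucleotides), listed 5' to 3'.
# The universal label is 5'-most and the molecular label is 3'-most; the
# order of the middle three labels is one arbitrary choice among several.
LAYOUT = [
    ("universal", 6),
    ("dimension", 3),
    ("spatial", 4),
    ("cellular", 4),
    ("molecular", 4),
]


def split_barcode(seq: str, layout=LAYOUT) -> dict:
    """Split a barcode sequence into its labels; whatever follows the last
    label is treated as the target-binding region (e.g., oligo(dT))."""
    total = sum(length for _, length in layout)
    if len(seq) < total:
        raise ValueError("sequence is shorter than the assumed layout")
    fields, position = {}, 0
    for name, length in layout:
        fields[name] = seq[position:position + length]
        position += length
    fields["target_binding"] = seq[position:]
    return fields
```

If spacer nucleotides separate the labels, they could be represented in this sketch as additional entries in the layout that are simply ignored after splitting.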

In some embodiments, stochastically barcoding the plurality of targets in the sample includes hybridizing the plurality of stochastic barcodes with the plurality of targets to generate stochastically barcoded targets, and at least one of the plurality of targets is hybridized to one of the plurality of stochastic barcodes. A portion or all of the plurality of targets can be hybridized to the plurality of stochastic barcodes. For example, in some embodiments, each of the plurality of targets is hybridized to one of the plurality of stochastic barcodes. In some embodiments, each of at least two, three, four, five, ten, twenty, fifty, one hundred, or one thousand of the plurality of targets is hybridized to one of the plurality of stochastic barcodes.

A stochastic barcode can comprise one or more universal labels. The one or more universal labels can be the same for all stochastic barcodes in the set of stochastic barcodes attached to a given solid support. In some embodiments, the one or more universal labels can be the same for all stochastic barcodes attached to a plurality of beads. In some embodiments, a universal label can comprise a nucleic acid sequence that is capable of hybridizing to a sequencing primer. Sequencing primers can be used for sequencing stochastic barcodes comprising a universal label. Sequencing primers (e.g., universal sequencing primers) can comprise sequencing primers associated with high-throughput sequencing platforms. In some embodiments, a universal label can comprise a nucleic acid sequence that is capable of hybridizing to a PCR primer. In some embodiments, the universal label can comprise a nucleic acid sequence that is capable of hybridizing to a sequencing primer and a PCR primer. The nucleic acid sequence of the universal label that is capable of hybridizing to a sequencing or PCR primer can be referred to as a primer binding site. A universal label can comprise a sequence that can be used to initiate transcription of the stochastic barcode. A universal label can comprise a sequence that can be used for extension of the stochastic barcode or a region within the stochastic barcode. A universal label can be at least about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length. A universal label can comprise at least about 10 nucleotides. A universal label can be at most about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length. In some embodiments, a cleavable linker or modified nucleotide can be part of the universal label sequence to enable the stochastic barcode to be cleaved off from the support.

A stochastic barcode can comprise a dimension label. A dimension label can comprise a nucleic acid sequence that provides information about a dimension in which the stochastic labeling occurred. For example, a dimension label can provide information about the time at which a target was stochastically barcoded. A dimension label can be associated with a time of stochastic barcoding in a sample. A dimension label can be activated at the time of stochastic labeling. Different dimension labels can be activated at different times. The dimension label provides information about the order in which targets, groups of targets, and/or samples were stochastically barcoded. For example, a population of cells can be stochastically barcoded at the G0 phase of the cell cycle. The cells can be pulsed again with stochastic barcodes at the G1 phase of the cell cycle. The cells can be pulsed again with stochastic barcodes at the S phase of the cell cycle, and so on. Stochastic barcodes at each pulse (e.g., each phase of the cell cycle) can comprise different dimension labels. In this way, the dimension label provides information about which targets were labelled at which phase of the cell cycle. Dimension labels can interrogate many different biological times. Exemplary biological times can include, but are not limited to, the cell cycle, transcription (e.g., transcription initiation), and transcript degradation. In another example, a sample (e.g., a cell, a population of cells) can be stochastically labeled before and/or after treatment with a drug and/or therapy. The changes in the number of copies of distinct targets can be indicative of the sample's response to the drug and/or therapy.
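The before-and-after comparison described above can be sketched as follows, assuming for illustration that each read has already been resolved into a dimension label, a target, and a molecular label. The dimension label sequences, the time point names, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical assignment of dimension labels to the pulse at which the
# barcodes carrying them were applied.
DIMENSION_TO_TIMEPOINT = {"AAC": "before_drug", "GGT": "after_drug"}


def counts_by_timepoint(reads, dimension_to_timepoint=DIMENSION_TO_TIMEPOINT):
    """reads: iterable of (dimension_label, target, molecular_label).
    Returns {(timepoint, target): number of distinct molecular labels}."""
    seen = defaultdict(set)
    for dimension_label, target, molecular_label in reads:
        timepoint = dimension_to_timepoint[dimension_label]
        seen[(timepoint, target)].add(molecular_label)
    return {key: len(labels) for key, labels in seen.items()}


def change_in_copies(counts, target, earlier, later):
    """Difference in estimated copy number of a target between two pulses."""
    return counts.get((later, target), 0) - counts.get((earlier, target), 0)
```

A positive value from the hypothetical `change_in_copies` would indicate more copies of the target at the later pulse; how such a change is interpreted depends on the experiment.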

A dimension label can be activatable. An activatable dimension label can be activated at a specific time point. The activatable dimension label can be, for example, constitutively activated (e.g., not turned off). The activatable dimension label can be, for example, reversibly activated (e.g., the activatable dimension label can be turned on and turned off). The dimension label can be, for example, reversibly activatable at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more times. In some embodiments, the dimension label can be activated with fluorescence, light, a chemical event (e.g., cleavage, ligation of another molecule, addition of modifications (e.g., pegylated, sumoylated, acetylated, methylated, deacetylated, demethylated)), a photochemical event (e.g., photocaging), or introduction of a non-natural nucleotide.

The dimension label can, in some embodiments, be identical for all stochastic barcodes attached to a given solid support (e.g., bead), but different for different solid supports (e.g., beads). In some embodiments, at least 60%, 70%, 80%, 85%, 90%, 95%, 97%, 99% or 100% of stochastic barcodes on the same solid support can comprise the same dimension label. In some embodiments, at least 60% of stochastic barcodes on the same solid support can comprise the same dimension label. In some embodiments, at least 95% of stochastic barcodes on the same solid support can comprise the same dimension label.

There can be as many as 10⁶ or more unique dimension label sequences represented in a plurality of solid supports (e.g., beads). A dimension label can be at least about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length. A dimension label can be at most about 300, 200, 100, 90, 80, 70, 60, 50, 40, 30, 20, 15, 12, 10, 9, 8, 7, 6, 5, 4 or fewer or more nucleotides in length. A dimension label can comprise between about 5 to about 200 nucleotides. A dimension label can comprise between about 10 to about 150 nucleotides. A dimension label can comprise between about 20 to about 125 nucleotides in length.
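As a rough check relating label length to the number of unique sequences, a label of L nucleotides drawn from the four standard bases can take at most 4^L distinct sequences. The sketch below computes the shortest length that can accommodate a given number of unique labels under that assumption; it ignores any sequence constraints (e.g., error-correcting design) that a particular embodiment might impose, and the function name is hypothetical.

```python
def min_label_length(n_unique: int, alphabet_size: int = 4) -> int:
    """Smallest length L such that alphabet_size ** L >= n_unique."""
    if n_unique < 1 or alphabet_size < 2:
        raise ValueError("need n_unique >= 1 and alphabet_size >= 2")
    length, capacity = 0, 1
    while capacity < n_unique:
        capacity *= alphabet_size  # one more position multiplies capacity
        length += 1
    return length
```

Under these assumptions, 10⁶ unique sequences require a label of at least 10 nucleotides, since 4⁹ = 262,144 and 4¹⁰ = 1,048,576.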

A stochastic barcode can comprise a spatial label. A spatial label can comprise a nucleic acid sequence that provides information about the spatial orientation of a target molecule which is associated with the stochastic barcode. A spatial label can be associated with a coordinate in a sample. The coordinate can be a fixed coordinate. For example, a coordinate can be fixed in reference to a substrate. A spatial label can be in reference to a two- or three-dimensional grid. A coordinate can be fixed in reference to a landmark. The landmark can be identifiable in space. A landmark can be a structure which can be imaged. A landmark can be a biological structure, for example an anatomical landmark. A landmark can be a cellular landmark, for instance an organelle. A landmark can be a non-natural landmark such as a structure with an identifiable identifier such as a color code, bar code, magnetic property, fluorescence, radioactivity, or a unique size or shape. A spatial label can be associated with a physical partition (e.g., a well, a container, or a droplet). In some embodiments, multiple spatial labels are used together to encode one or more positions in space.
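As one hypothetical illustration of multiple spatial labels being used together, the sketch below combines a row label and a column label into a coordinate fixed in reference to a substrate laid out as a two-dimensional grid of wells. The label sequences, the well pitch, and the function name are assumptions made for the example.

```python
# Hypothetical lookup tables: spatial label sequence -> grid index.
ROW_LABELS = {"ACGT": 0, "TGCA": 1}
COLUMN_LABELS = {"GGAA": 0, "CCTT": 1, "AGAG": 2}


def decode_position(row_label, column_label, well_pitch_um=100.0,
                    rows=ROW_LABELS, columns=COLUMN_LABELS):
    """Return (x, y) in micrometers relative to the substrate origin,
    assuming wells spaced well_pitch_um apart in both directions."""
    row = rows[row_label]
    column = columns[column_label]
    return (column * well_pitch_um, row * well_pitch_um)
```

With two row labels and three column labels, five label sequences are enough to address six wells in this sketch.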

The spatial label can be identical for all stochastic barcodes attached to a given solid support (e.g., bead), but different for different solid supports (e.g., beads). In some embodiments, at least 60%, 70%, 80%, 85%, 90%, 95%, 97%, 99% or 100% of stochastic barcodes on the same solid support can comprise the same spatial label. In some embodiments, at least 60% of stochastic barcodes on the same solid support can comprise the same spatial label. In some embodiments, at least 95% of stochastic barcodes on the same solid support can comprise the same spatial label.

There can be as many as 10⁶ or more unique spatial label sequences represented in a plurality of solid supports (e.g., beads). A spatial label can be at least about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length. A spatial label can be at most about 300, 200, 100, 90, 80, 70, 60, 50, 40, 30, 20, 15, 12, 10, 9, 8, 7, 6, 5, 4 or fewer or more nucleotides in length. A spatial label can comprise between about 5 and about 200 nucleotides. A spatial label can comprise between about 10 and about 150 nucleotides. A spatial label can comprise between about 20 and about 125 nucleotides in length.

Stochastic barcodes can comprise a cellular label. A cellular label can comprise a nucleic acid sequence that provides information for determining which target nucleic acid originated from which cell. In some embodiments, the cellular label is identical for all stochastic barcodes attached to a given solid support (e.g., bead), but different for different solid supports (e.g., beads). In some embodiments, at least 60%, 70%, 80%, 85%, 90%, 95%, 97%, 99% or 100% of stochastic barcodes on the same solid support can comprise the same cellular label. In some embodiments, at least 60% of stochastic barcodes on the same solid support can comprise the same cellular label. In some embodiments, at least 95% of stochastic barcodes on the same solid support can comprise the same cellular label.

There can be as many as 10⁶ or more unique cellular label sequences represented in a plurality of solid supports (e.g., beads). A cellular label can be at least about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length. A cellular label can be at most about 300, 200, 100, 90, 80, 70, 60, 50, 40, 30, 20, 15, 12, 10, 9, 8, 7, 6, 5, 4 or fewer or more nucleotides in length. A cellular label can comprise between about 5 and about 200 nucleotides. A cellular label can comprise between about 10 and about 150 nucleotides. A cellular label can comprise between about 20 and about 125 nucleotides in length.

The cellular label can further comprise a unique set of nucleic acid sub-sequences of defined length, e.g. 7 nucleotides each (equivalent to the number of bits used in some Hamming error correction codes), which can be designed to provide error correction capability. A set of error correction sub-sequences comprising 7-nucleotide sequences can be designed such that any pairwise combination of sequences in the set exhibits a defined “genetic distance” (or number of mismatched bases); for example, a set of error correction sub-sequences can be designed to exhibit a genetic distance of 3 nucleotides. In this case, review of the error correction sequences in the set of sequence data for labeled target nucleic acid molecules (described more fully below) can allow one to detect or correct amplification or sequencing errors. In some embodiments, the length of the nucleic acid sub-sequences used for creating error correction codes can vary, for example, they can be 3 nucleotides, 7 nucleotides, 15 nucleotides, or 31 nucleotides in length. In some embodiments, nucleic acid sub-sequences of other lengths can be used for creating error correction codes.
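The pairwise-distance property described above can be illustrated with a short Python sketch. It assumes a simple greedy construction (not necessarily how any actual label set is designed) that collects 7-nucleotide sequences differing from one another in at least 3 positions, and then corrects a read containing a single mismatch. The function names and the set size of 16 are illustrative assumptions.

```python
from itertools import product
from typing import List, Optional


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def build_code(length: int = 7, min_distance: int = 3, limit: int = 16) -> List[str]:
    """Greedily collect sequences whose pairwise distance is >= min_distance."""
    code: List[str] = []
    for candidate in product("ACGT", repeat=length):
        seq = "".join(candidate)
        if all(hamming(seq, kept) >= min_distance for kept in code):
            code.append(seq)
            if len(code) == limit:
                break
    return code


def correct(read: str, code: List[str]) -> Optional[str]:
    """Return the unique codeword within one mismatch of the read, else None."""
    hits = [c for c in code if hamming(read, c) <= 1]
    return hits[0] if len(hits) == 1 else None


code = build_code()
```

With a minimum distance of 3, a read carrying one substitution remains closer to its original sequence than to any other member of the set, so it can be corrected. A read with two substitutions can no longer be corrected reliably.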

In some embodiments, stochastic barcodes can comprise a molecular label. A molecular label can comprise a nucleic acid sequence that provides identifying information for the specific type of target nucleic acid species hybridized to the stochastic barcode. A molecular label can comprise a nucleic acid sequence that provides a counter for the specific occurrence of the target nucleic acid species hybridized to the stochastic barcode (e.g., target-binding region). In some embodiments, a diverse set of molecular labels are attached to a given solid support (e.g., bead). In some embodiments, there can be as many as 10⁵ or more unique molecular label sequences attached to a given solid support (e.g., bead). In some embodiments, there can be as many as 10⁴ or more unique molecular label sequences attached to a given solid support (e.g., bead). In some embodiments, there can be as many as 10³ or more unique molecular label sequences attached to a given solid support (e.g., bead). In some embodiments, there can be as many as 10² or more unique molecular label sequences attached to a given solid support (e.g., bead). A molecular label can be at least about 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length. A molecular label can be at most about 300, 200, 100, 90, 80, 70, 60, 50, 40, 30, 20, 15, 12, 10, 9, 8, 7, 6, 5, 4 or fewer nucleotides in length.
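The counting role of the molecular label can be sketched as follows. This is a minimal illustration, assuming each sequencing read has already been parsed into a (spatial label, molecular label, target) tuple. The read data, label sequences, and gene names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Read = Tuple[str, str, str]  # (spatial label, molecular label, target)


def count_targets(reads: Iterable[Read]) -> Dict[Tuple[str, str], int]:
    """Count distinct molecular labels per (spatial label, target) pair."""
    labels: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for spatial_label, molecular_label, target in reads:
        labels[(spatial_label, target)].add(molecular_label)
    return {key: len(value) for key, value in labels.items()}


reads = [
    ("SP01", "ACGTACGT", "GAPDH"),
    ("SP01", "ACGTACGT", "GAPDH"),  # amplification duplicate, counted once
    ("SP01", "TTGACCAA", "GAPDH"),
    ("SP02", "ACGTACGT", "GAPDH"),
    ("SP02", "GGCATTCA", "ACTB"),
]
counts = count_targets(reads)
```

Reads that share the same spatial label, target, and molecular label are treated as copies of one original molecule, so amplification duplicates do not inflate the count.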

Stochastic barcodes can comprise a target binding region. In some embodiments, the target binding regions can comprise a nucleic acid sequence that hybridizes specifically to a target (e.g. target nucleic acid, target molecule, e.g., a cellular nucleic acid to be analyzed), for example to a specific gene sequence. In some embodiments, a target binding region can comprise a nucleic acid sequence that can attach (e.g., hybridize) to a specific location of a specific target nucleic acid. In some embodiments, the target binding region can comprise a nucleic acid sequence that is capable of specific hybridization to a restriction enzyme site overhang (e.g. an EcoRI sticky-end overhang). The stochastic barcode can then ligate to any nucleic acid molecule comprising a sequence complementary to the restriction site overhang.

In some embodiments, a target binding region can comprise a non-specific target nucleic acid sequence. A non-specific target nucleic acid sequence can refer to a sequence that can bind to multiple target nucleic acids, independent of the specific sequence of the target nucleic acid. For example, a target binding region can comprise a random multimer sequence, or an oligo-dT sequence that hybridizes to the poly(A) tail on mRNA molecules. A random multimer sequence can be, for example, a random dimer, trimer, tetramer, pentamer, hexamer, heptamer, octamer, nonamer, decamer, or higher multimer sequence of any length. In some embodiments, the target binding region is the same for all stochastic barcodes attached to a given bead. In some embodiments, the target binding regions for the plurality of stochastic barcodes attached to a given bead can comprise two or more different target binding sequences. A target binding region can be at least about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length. A target binding region can be at most about 5, 10, 15, 20, 25, 30, 35, 40, 45, 50 or more nucleotides in length.

A stochastic barcode can comprise an orientation property which can be used to orient (e.g., align) the stochastic barcodes. A stochastic barcode can comprise a moiety for isoelectric focusing. Different stochastic barcodes can comprise different isoelectric focusing points. When these stochastic barcodes are introduced to a sample, the sample can undergo isoelectric focusing in order to orient the stochastic barcodes in a known way. In this way, the orientation property can be used to develop a known map of stochastic barcodes in a sample. Exemplary orientation properties can include electrophoretic mobility (e.g., based on size of the stochastic barcode), isoelectric point, spin, conductivity, and/or self-assembly. For example, stochastic barcodes with an orientation property of self-assembly can self-assemble into a specific orientation (e.g., nucleic acid nanostructure) upon activation.

A stochastic barcode can comprise an affinity property. A spatial label can comprise an affinity property. An affinity property can include a chemical and/or biological moiety that can facilitate binding of the stochastic barcode to another entity (e.g., cell receptor). For example, an affinity property can comprise an antibody. An antibody can be specific for a specific moiety (e.g., receptor) on a sample. An antibody can guide the stochastic barcode to a specific cell type or molecule. Targets at and/or near the specific cell type or molecule can be stochastically labeled. An affinity property can also provide spatial information in addition to the nucleotide sequence of the spatial label because the antibody can guide the stochastic barcode to a specific location. An antibody can be a therapeutic antibody. An antibody can be a monoclonal antibody. An antibody can be a polyclonal antibody. An antibody can be humanized. An antibody can be chimeric. An antibody can be a naked antibody. An antibody can be a fusion antibody.

An antibody can refer to a full-length (i.e., naturally occurring or formed by normal immunoglobulin gene fragment recombinatorial processes) immunoglobulin molecule (e.g., an IgG antibody) or an immunologically active (i.e., specifically binding) portion of an immunoglobulin molecule, like an antibody fragment.

An antibody can be an antibody fragment. An antibody fragment can be a portion of an antibody such as F(ab′)2, Fab′, Fab, Fv, sFv and the like. An antibody fragment can bind with the same antigen that is recognized by the full-length antibody. An antibody fragment can include isolated fragments consisting of the variable regions of antibodies, such as the “Fv” fragments consisting of the variable regions of the heavy and light chains and recombinant single chain polypeptide molecules in which light and heavy variable regions are connected by a peptide linker (“scFv proteins”). Exemplary antibodies can include, but are not limited to, antibodies for cancer cells, antibodies for viruses, antibodies that bind to cell surface receptors (CD8, CD34, CD45), and therapeutic antibodies.

Solid Supports

The stochastic barcodes disclosed herein can be associated to (e.g., attached to) a solid support (e.g., a bead). In some embodiments, stochastically barcoding the plurality of targets in the sample can be performed with a solid support including a plurality of synthetic particles associated with the plurality of stochastic barcodes. In some embodiments, the solid support can include a plurality of synthetic particles associated with the plurality of stochastic barcodes. The spatial labels of the plurality of stochastic barcodes on different solid supports can differ by at least one nucleotide. The solid support can, for example, include the plurality of stochastic barcodes in two dimensions or three dimensions. The synthetic particles can be beads. The beads can be silica gel beads, controlled pore glass beads, magnetic beads, Dynabeads, Sephadex/Sepharose beads, cellulose beads, polystyrene beads, or any combination thereof. The solid support can include a polymer, a matrix, a hydrogel, a needle array device, an antibody, or any combination thereof. In some embodiments, the solid supports can be free floating. In some embodiments, the solid supports can be embedded in a semi-solid or solid array. The stochastic barcodes may not be associated with solid supports. The stochastic barcodes can be individual nucleotides. The stochastic barcodes can be associated with a substrate.

As used herein, the terms “tethered”, “attached”, and “immobilized” are used interchangeably, and can refer to covalent or non-covalent means for attaching stochastic barcodes to a solid support. Any of a variety of different solid supports can be used as solid supports for attaching pre-synthesized stochastic barcodes or for in situ solid-phase synthesis of stochastic barcodes.

In some embodiments, a solid support is a bead. A bead can encompass one or more types of solid, porous, or hollow sphere, ball, bearing, cylinder, or other similar configuration on which a nucleic acid can be immobilized (e.g., covalently or non-covalently). The bead can be, for example, composed of plastic, ceramic, metal, polymeric material, or any combination thereof. A bead can be, or comprise, a discrete particle that is spherical (e.g., microspheres) or has a non-spherical or irregular shape, such as cubic, cuboid, pyramidal, cylindrical, conical, oblong, or disc-shaped, and the like. In some embodiments, a bead can be non-spherical in shape.

Beads can comprise a variety of materials including, but not limited to, paramagnetic materials (e.g. magnesium, molybdenum, lithium, and tantalum), superparamagnetic materials (e.g. ferrite (Fe₃O₄; magnetite) nanoparticles), ferromagnetic materials (e.g. iron, nickel, cobalt, some alloys thereof, and some rare earth metal compounds), ceramic, plastic, glass, polystyrene, silica, methylstyrene, acrylic polymers, titanium, latex, sepharose, agarose, hydrogel, polymer, cellulose, nylon, and any combination thereof.

The diameter of the beads can vary, for example, be at least about 100 nm, 500 nm, 1 μm, 5 μm, 10 μm, 20 μm, 25 μm, 30 μm, 35 μm, 40 μm, 45 μm or 50 μm. In some embodiments, the diameter of the beads can be at most about 100 nm, 500 nm, 1 μm, 5 μm, 10 μm, 20 μm, 25 μm, 30 μm, 35 μm, 40 μm, 45 μm or 50 μm. In some embodiments, the diameter of the bead can be related to the diameter of the wells of the substrate. For example, the diameter of the bead can be at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% longer or shorter than the diameter of the well. In some embodiments, the diameter of the bead can be at most 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% longer or shorter than the diameter of the well. The diameter of the bead can be related to the diameter of a cell (e.g., a single cell entrapped by a well of the substrate). The diameter of the bead can be at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 150%, 200%, 250%, or 300% or more longer or shorter than the diameter of the cell. The diameter of the bead can be at most 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 150%, 200%, 250%, or 300% or more longer or shorter than the diameter of the cell.

A bead can be attached to and/or embedded in a substrate. A bead can be attached to and/or embedded in a gel, hydrogel, polymer and/or matrix. The spatial position of a bead within a substrate (e.g., gel, matrix, scaffold, or polymer) can be identified using the spatial label present on the stochastic barcode on the bead which can serve as a location address.

Examples of beads can include, but are not limited to, streptavidin beads, agarose beads, magnetic beads, Dynabeads®, MACS® microbeads, antibody conjugated beads (e.g., anti-immunoglobulin microbeads), protein A conjugated beads, protein G conjugated beads, protein A/G conjugated beads, protein L conjugated beads, oligo(dT) conjugated beads, silica beads, silica-like beads, anti-biotin microbeads, anti-fluorochrome microbeads, and BcMag™ Carboxyl-Terminated Magnetic Beads.

A bead can be associated with (e.g. impregnated with) quantum dots or fluorescent dyes to make it fluorescent in one fluorescence optical channel or multiple optical channels. A bead can be associated with iron oxide or chromium oxide to make it paramagnetic or ferromagnetic. Beads can be identifiable. For example, a bead can be imaged using a camera. A bead can have a detectable code associated with the bead. For example, a bead can comprise a stochastic barcode. A bead can change size, for example due to swelling in an organic or inorganic solution. A bead can be hydrophobic. A bead can be hydrophilic. A bead can be biocompatible.

A solid support (e.g., bead) can be visualized. The solid support can comprise a visualizing tag (e.g., fluorescent dye). A solid support (e.g., bead) can be etched with an identifier (e.g., a number). The identifier can be visualized through imaging the beads.

A solid support can refer to an insoluble or semi-soluble material. A solid support can be referred to as “functionalized” when it includes a linker, a scaffold, a building block, or other reactive moiety attached thereto, whereas a solid support can be “nonfunctionalized” when it lacks such a reactive moiety attached thereto. The solid support can be employed free in solution, such as in a microtiter well format; in a flow-through format, such as in a column; or in a dipstick.

The solid support can comprise a membrane, paper, plastic, coated surface, flat surface, glass, slide, chip, or any combination thereof. A solid support can take the form of resins, gels, microspheres, or other geometric configurations. A solid support can comprise silica chips, synthetic particles, nanoparticles, plates, and arrays. Solid supports can include beads (e.g., silica gel, controlled pore glass, magnetic beads, Dynabeads, Wang resin, Merrifield resin, Sephadex/Sepharose beads, cellulose beads, polystyrene beads, etc.); capillaries; flat supports such as glass fiber filters, glass surfaces, metal surfaces (steel, gold, silver, aluminum, silicon and copper), glass supports, plastic supports, silicon supports, chips, filters, membranes, microwell plates, slides, or the like; plastic materials including multi-well plates or membranes (e.g., formed of polyethylene, polypropylene, polyamide, polyvinylidene difluoride); wafers, combs, pins or needles (e.g., arrays of pins suitable for combinatorial synthesis or analysis); or beads in an array of pits or nanoliter wells of flat surfaces such as wafers (e.g., silicon wafers), or wafers with pits with or without filter bottoms.

In some embodiments, stochastic barcodes of the disclosure can be attached to a polymer matrix (e.g., gel, hydrogel). The polymer matrix can be able to permeate intracellular space (e.g., around organelles). The polymer matrix can be able to be pumped throughout the circulatory system.

A solid support can be a biological molecule. For example, a solid support can be a nucleic acid, a protein, an antibody, a histone, a cellular compartment, a lipid, a carbohydrate, and the like. Solid supports that are biological molecules can be amplified, translated, transcribed, degraded, and/or modified (e.g., pegylated, sumoylated). A solid support that is a biological molecule can provide spatial and time information in addition to the spatial label that is attached to the biological molecule. For example, a biological molecule can comprise a first conformation when unmodified, but can change to a second conformation when modified. The different conformations can expose stochastic barcodes of the disclosure to targets. For example, a biological molecule can comprise stochastic barcodes that are inaccessible due to folding of the biological molecule. Upon modification of the biological molecule (e.g., acetylation), the biological molecule can change conformation to expose the stochastic labels. The timing of the modification can provide another time dimension to the method of stochastic barcoding of the disclosure.

In another example, the biological molecule comprising stochastic barcodes of the disclosure can be located in the cytoplasm of a cell. Upon activation, the biological molecule can move to the nucleus, whereupon stochastic barcoding can take place. In this way, modification of the biological molecule can encode additional space-time information for the targets identified by the stochastic barcodes.

A dimension label can provide information about space-time of a biological event (e.g., cell division). For example, a dimension label can be added to a first cell, the first cell can divide generating a second daughter cell, and the second daughter cell can comprise all, some or none of the dimension labels. The dimension labels can be activated in the original cell and the daughter cell. In this way, the dimension label can provide information about time of stochastic barcoding in distinct spaces.

Microarrays

In some embodiments, a solid support/substrate can refer to a microarray. A microarray can comprise a plurality of polymers, e.g., oligomers, synthesized in situ or pre-synthesized and deposited on a substrate in an array pattern. Microarrays of oligomers manufactured by solid-phase DNA synthesis can have oligomer densities approaching 10⁶/micron². As used herein, the support-bound oligomers can be referred to as “probes”, which function to bind or hybridize with a sample of DNA or RNA material under test. However, the terms can be used interchangeably, with the surface-bound oligonucleotides referred to as targets and the solution sample of nucleic acids as probes. Further, some investigators bind the target sample under test to the microarray substrate and put the oligomer probes in solution for hybridization. Either of the “targets” or “probes” can be the one that is to be evaluated by the other (thus, either one could be an unknown mixture of polynucleotides to be evaluated by binding with the other). All of these iterations are within the scope of the discussion herein. For the purpose of simplicity only, herein the probe is the surface-bound oligonucleotide of known sequence and the target is the moiety in a mobile phase (typically fluid), to be detected by the surface-bound probes. The plurality of probes and/or targets in each location in the array can be referred to as a “nucleic acid feature” or “feature.” A feature is defined as a locus onto which a large number of probes and/or targets all having the same nucleotide sequence are immobilized.

Depending on the make-up of the target sample, hybridization of probe features may or may not occur at all probe feature locations and can occur to varying degrees at the different probe feature locations.

An “array” can refer to an intentionally created collection of molecules which can be prepared either synthetically or biosynthetically. The molecules in the array can be identical or different from each other. The array can assume a variety of formats, e.g., libraries of soluble molecules; libraries of compounds tethered to resin beads, silica chips, or other solid supports. An “array plate” or “plate” can refer to a body having a plurality of arrays in which each array can be separated from the other arrays by a physical barrier resistant to the passage of liquids and forming an area or space, referred to as a well.

The density of the microarrays can be higher than 500, 5000, 50000, or 500,000 different probes per cm². The feature size of the probes can be smaller than 500, 150, 25, 9, or 1 μm². The locations of the probes can be determined or decipherable. For example, in some arrays, the specific locations of the probes are known before binding assays. In some other arrays, the specific locations of the probes are unknown until after the assays. The probes can be immobilized on a substrate, optionally, via a linker, beads, etc.

The array can comprise features made up of oligo(dT) probes. The array can comprise features made up of gene-specific probes. In some embodiments, the array is a microarray. In some embodiments, the array is an array of solid supports (e.g., beads). In some embodiments, the array is planar. In some embodiments, the array has topographical features.

Substrates

A substrate can refer to a type of solid support. A substrate can refer to a solid support that can comprise stochastic barcodes of the disclosure. A substrate can comprise a plurality of microwells. A microwell can comprise a small reaction chamber of defined volume. A microwell can entrap one or more cells. A microwell can entrap only one cell. A microwell can entrap one or more solid supports. A microwell can entrap only one solid support. In some embodiments, a microwell entraps a single cell and a single solid support (e.g., bead).

The microwells of the array can be fabricated in a variety of shapes and sizes. Appropriate well geometries can include, but are not limited to, cylindrical, conical, hemispherical, rectangular, or polyhedral (e.g., three dimensional geometries comprised of several planar faces, for example, hexagonal columns, octagonal columns, inverted triangular pyramids, inverted square pyramids, inverted pentagonal pyramids, inverted hexagonal pyramids, or inverted truncated pyramids). The microwells can comprise a shape that combines two or more of these geometries. For example, a microwell can be partly cylindrical, with the remainder having the shape of an inverted cone. A microwell can include two side-by-side cylinders, one of larger diameter (e.g. that corresponds roughly to the diameter of the beads) than the other (e.g. that corresponds roughly to the diameter of the cells), that are connected by a vertical channel (that is, parallel to the cylinder axes) that extends the full length (depth) of the cylinders. The opening of the microwell can be at the upper surface of the substrate. The opening of the microwell can be at the lower surface of the substrate. The closed end (or bottom) of the microwell can be flat. The closed end (or bottom) of the microwell can have a curved surface (e.g., convex or concave). The shape and/or size of the microwell can be determined based on the types of cells or solid supports to be trapped within the microwells.

Microwell dimensions can be characterized in terms of the diameter and depth of the well. As used herein, the diameter of the microwell refers to the largest circle that can be inscribed within the planar cross-section of the microwell geometry. The diameter of the microwells can range from about 1-fold to about 10-fold the diameter of the cells or solid supports to be trapped within the microwells. The microwell diameter can be at least 1-fold, at least 1.5-fold, at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or at least 10-fold the diameter of the cells or solid supports to be trapped within the microwells. The microwell diameter can be at most 10-fold, at most 5-fold, at most 4-fold, at most 3-fold, at most 2-fold, at most 1.5-fold, or at most 1-fold the diameter of the cells or solid supports to be trapped within the microwells. The microwell diameter can be about 2.5-fold the diameter of the cells or solid supports to be trapped within the microwells.

The diameter of the microwells can be specified in terms of absolute dimensions. The diameter of the microwells can range from about 5 to about 50 micrometers. The microwell diameter can be at least 5 micrometers, at least 10 micrometers, at least 15 micrometers, at least 20 micrometers, at least 25 micrometers, at least 30 micrometers, at least 35 micrometers, at least 40 micrometers, at least 45 micrometers, or at least 50 micrometers. The microwell diameter can be at most 50 micrometers, at most 45 micrometers, at most 40 micrometers, at most 35 micrometers, at most 30 micrometers, at most 25 micrometers, at most 20 micrometers, at most 15 micrometers, at most 10 micrometers, or at most 5 micrometers. The microwell diameter can be about 30 micrometers.

In some embodiments, the diameter of each microwell can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, nanometers. In some embodiments, the diameter of each microwell can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, micrometers. In some embodiments, the diameter of each microwell can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, millimeters.

The microwell depth can be chosen to provide efficient trapping of cells and solid supports. The microwell depth can be chosen to provide efficient exchange of assay buffers and other reagents contained within the wells. The ratio of diameter to height (i.e. aspect ratio) can be chosen such that once a cell and solid support settle inside a microwell, they will not be displaced by fluid motion above the microwell. In some embodiments, the height of the microwell can be smaller than the diameter of the bead. For example, the height of the microwell can be at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% of the diameter of the bead. The bead can protrude outside of the microwell.

The dimensions of the microwell can be chosen such that the microwell has sufficient space to accommodate a solid support and a cell of various sizes without their being dislodged by fluid motion above the microwell. The depth of the microwells can range from about 1-fold to about 10-fold the diameter of the cells or solid supports to be trapped within the microwells. The microwell depth can be at least 1-fold, at least 1.5-fold, at least 2-fold, at least 3-fold, at least 4-fold, at least 5-fold, or at least 10-fold the diameter of the cells or solid supports to be trapped within the microwells. The microwell depth can be at most 10-fold, at most 5-fold, at most 4-fold, at most 3-fold, at most 2-fold, at most 1.5-fold, or at most 1-fold the diameter of the cells or solid supports to be trapped within the microwells. The microwell depth can be about 2.5-fold the diameter of the cells or solid supports to be trapped within the microwells.

The depth of the microwells can be specified in terms of absolute dimensions. The depth of the microwells can range from about 10 to about 60 micrometers. The microwell depth can be at least 10 micrometers, at least 20 micrometers, at least 25 micrometers, at least 30 micrometers, at least 35 micrometers, at least 40 micrometers, at least 50 micrometers, or at least 60 micrometers. The microwell depth can be at most 60 micrometers, at most 50 micrometers, at most 40 micrometers, at most 35 micrometers, at most 30 micrometers, at most 25 micrometers, at most 20 micrometers, or at most 10 micrometers. The microwell depth can be about 30 micrometers.

In some embodiments, the depth of each microwell can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, nanometers. In some embodiments, the depth of each microwell can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, micrometers. In some embodiments, the depth of each microwell can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, millimeters.

The volume of the microwells used in the methods, devices, and systems of the present disclosure can vary, for example range from about 200 micrometers³ to about 120,000 micrometers³. The microwell volume can be at least 200 micrometers³, at least 500 micrometers³, at least 1,000 micrometers³, at least 10,000 micrometers³, at least 25,000 micrometers³, at least 50,000 micrometers³, at least 100,000 micrometers³, or at least 120,000 micrometers³. The microwell volume can be at most 120,000 micrometers³, at most 100,000 micrometers³, at most 50,000 micrometers³, at most 25,000 micrometers³, at most 10,000 micrometers³, at most 1,000 micrometers³, at most 500 micrometers³, or at most 200 micrometers³. The microwell volume can be about 25,000 micrometers³. The microwell volume can fall within any range bounded by any of these values (e.g. from about 18,000 micrometers³ to about 30,000 micrometers³).
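To relate these volumes to the diameters and depths given earlier, the following Python sketch computes the volume of a microwell under the assumption that the well is a simple cylinder. The cylindrical geometry and the function name are illustrative choices. The 30 micrometer diameter and depth are the "about 30 micrometers" values stated above.

```python
import math


def cylinder_volume_um3(diameter_um: float, depth_um: float) -> float:
    """Volume in cubic micrometers of a cylindrical well (assumed geometry)."""
    radius_um = diameter_um / 2
    return math.pi * radius_um ** 2 * depth_um


# A 30 micrometer wide, 30 micrometer deep cylinder: pi * 15^2 * 30
volume_um3 = cylinder_volume_um3(30, 30)  # approximately 21,206 cubic micrometers
```

Under this assumption the volume falls within the exemplary range of about 18,000 micrometers³ to about 30,000 micrometers³.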

In some embodiments, each of the microwells can have a volume of 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, nanoliters. In some embodiments, each of the microwells can have a volume of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, microliters. In some embodiments, each of the microwells can have a volume of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, milliliters.

The volumes of the microwells used in the methods, devices, and systems of the present disclosure can be further characterized in terms of the variation in volume from one microwell to another. The coefficient of variation (expressed as a percentage) for microwell volume can range from about 1% to about 10%. The coefficient of variation for microwell volume can be at least 1%, at least 2%, at least 3%, at least 4%, at least 5%, at least 6%, at least 7%, at least 8%, at least 9%, or at least 10%. The coefficient of variation for microwell volume can be at most 10%, at most 9%, at most 8%, at most 7%, at most 6%, at most 5%, at most 4%, at most 3%, at most 2%, or at most 1%. The coefficient of variation for microwell volume can have any value within a range encompassed by these values, for example between about 1.5% and about 6.5%. In some embodiments, the coefficient of variation of microwell volume can be about 2.5%.
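The coefficient of variation is the standard deviation divided by the mean, expressed here as a percentage. The sketch below assumes the sample standard deviation is used (the text does not say which estimator is intended), and the measured volumes are invented for illustration.

```python
import statistics
from typing import Sequence


def coefficient_of_variation_percent(volumes_um3: Sequence[float]) -> float:
    """Sample standard deviation divided by the mean, as a percentage."""
    mean = statistics.mean(volumes_um3)
    return 100 * statistics.stdev(volumes_um3) / mean


# Hypothetical measured volumes of five microwells, in cubic micrometers.
measured_um3 = [24000, 25000, 26000, 25000, 25000]
cv_percent = coefficient_of_variation_percent(measured_um3)
```

For these made-up measurements the result is a little under 3%, inside the 1% to 10% range described above.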

The ratio of the volume of the microwells to the surface area of the beads (or to the surface area of a solid support to which stochastic barcode oligonucleotides can be attached) used in the methods, devices, and systems of the present disclosure can vary, for example range from about 2.5 to about 1,520 micrometers. The ratio can be at least 2.5, at least 5, at least 10, at least 100, at least 500, at least 750, at least 1,000, or at least 1,520. The ratio can be at most 1,520, at most 1,000, at most 750, at most 500, at most 100, at most 10, at most 5, or at most 2.5. In some embodiments, the ratio can be, or be about, 67.5. The ratio of microwell volume to the surface area of the bead (or solid support used for immobilization) can fall within any range bounded by any of these values (e.g. from about 30 to about 120).
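As a non-limiting illustration, the ratio of microwell volume (micrometers³) to bead surface area (micrometers²) has units of micrometers. The sketch below assumes a smooth spherical bead, so that its surface area is π·d²; the function name and the 20 micrometer bead diameter are hypothetical and are not taken from the disclosure.

```python
import math


def volume_to_bead_area_ratio(well_volume_um3, bead_diameter_um):
    """Return microwell volume divided by bead surface area, in micrometers."""
    # Assumption: the bead is a smooth sphere, so area = pi * d^2.
    bead_area_um2 = math.pi * bead_diameter_um ** 2
    return well_volume_um3 / bead_area_um2


# 25,000 um^3 microwell (a volume recited above) and a hypothetical 20 um bead.
ratio_um = volume_to_bead_area_ratio(25_000, 20)
print(f"volume / surface area = {ratio_um:.1f} micrometers")
```

A porous or rough bead would present more surface area than this idealized sphere, which would lower the ratio accordingly.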

The wells of the microwell array can be arranged in a one dimensional, two dimensional, or three-dimensional array. A three dimensional array can be achieved, for example, by stacking a series of two or more two dimensional arrays (that is, by stacking two or more substrates comprising microwell arrays).

The pattern and spacing between microwells can be chosen to optimize the efficiency of trapping a single cell and single solid support (e.g., bead) in each well, as well as to maximize the number of wells per unit area of the array. The microwells can be distributed according to a variety of random or non-random patterns. For example, they can be distributed entirely randomly across the surface of the array substrate, or they can be arranged in a square grid, rectangular grid, hexagonal grid, or the like. The center-to-center distance (or spacing) between wells can vary from about 15 micrometers to about 75 micrometers. In other embodiments, the spacing between wells is at least 15 micrometers, at least 20 micrometers, at least 25 micrometers, at least 30 micrometers, at least 35 micrometers, at least 40 micrometers, at least 45 micrometers, at least 50 micrometers, at least 55 micrometers, at least 60 micrometers, at least 65 micrometers, at least 70 micrometers, or at least 75 micrometers. The microwell spacing can be at most 75 micrometers, at most 70 micrometers, at most 65 micrometers, at most 60 micrometers, at most 55 micrometers, at most 50 micrometers, at most 45 micrometers, at most 40 micrometers, at most 35 micrometers, at most 30 micrometers, at most 25 micrometers, at most 20 micrometers, or at most 15 micrometers. The microwell spacing can be about 55 micrometers. The microwell spacing can fall within any range bounded by any of these values (e.g. from about 18 micrometers to about 72 micrometers).

In some embodiments, microwells can be separated from each other by no more than 0.01, 0.1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, micrometers. In some embodiments, the microwells can be separated from one another by no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number between any two of these values, millimeters.

In some embodiments, the microwell array can comprise 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number between any two of these values, wells per inch. In some embodiments, the microwell array can comprise 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number between any two of these values, wells per cm².

The microwell array can comprise surface features between the microwells that are designed to help guide cells and solid supports into the wells and/or prevent them from settling on the surfaces between wells. Examples of suitable surface features can include, but are not limited to, domed, ridged, or peaked surface features that encircle the wells or straddle the surface between wells.

The total number of wells in the microwell array can be determined by the pattern and spacing of the wells and the overall dimensions of the array. The number of microwells in the array can vary, for example, range from about 96 to about 5,000,000 or more. The number of microwells in the array can be at least 96, at least 384, at least 1,536, at least 5,000, at least 10,000, at least 25,000, at least 50,000, at least 75,000, at least 100,000, at least 500,000, at least 1,000,000, or at least 5,000,000. The number of microwells in the array can be at most 5,000,000, at most 1,000,000, at most 75,000, at most 50,000, at most 25,000, at most 10,000, at most 5,000, at most 1,536, at most 384, or at most 96 wells. The number of microwells in the array can be about 96. The number of microwells can be about 150,000. The number of microwells in the array can fall within any range bounded by any of these values (e.g. from about 100 to 325,000).
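As a non-limiting illustration of how the pattern, spacing, and overall dimensions determine the total number of wells, the sketch below gives an upper-bound estimate for a square grid and for a hexagonal grid. It assumes that the wells cover the entire substrate footprint with no margins, that the hexagonal count can be approximated from the area of one hexagonal unit cell (√3/2 × pitch²), and that the function name is hypothetical.

```python
import math


def estimate_well_count(length_um, width_um, pitch_um, pattern="square"):
    """Upper-bound estimate of the number of wells on a rectangular array.

    Assumption: wells tile the whole footprint with no border or margins.
    """
    if pattern == "square":
        # Whole wells that fit along each edge at the given pitch.
        return int(length_um // pitch_um) * int(width_um // pitch_um)
    if pattern == "hexagonal":
        # Area-based approximation: each well occupies sqrt(3)/2 * pitch^2.
        cell_area_um2 = (math.sqrt(3) / 2) * pitch_um ** 2
        return int((length_um * width_um) / cell_area_um2)
    raise ValueError(f"unsupported pattern: {pattern}")


# Microscope-slide footprint (75 mm x 25 mm) at a 55 micrometer spacing.
square_wells = estimate_well_count(75_000, 25_000, 55, "square")
hex_wells = estimate_well_count(75_000, 25_000, 55, "hexagonal")
print(square_wells, hex_wells)
```

At the same center-to-center spacing the hexagonal estimate exceeds the square-grid estimate, consistent with choosing the pattern to maximize the number of wells per unit area.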

Microwell arrays can be fabricated using any of a number of fabrication techniques. Examples of fabrication methods that can be used include, but are not limited to, bulk micromachining techniques such as photolithography and wet chemical etching, plasma etching, or deep reactive ion etching; micro-molding and micro-embossing; laser micro-machining; 3D printing or other direct write fabrication processes using curable materials; and similar techniques.

Microwell arrays can be fabricated from any of a number of substrate materials. The choice of material can depend on the choice of fabrication technique, and vice versa. Examples of suitable materials can include, but are not limited to, silicon, fused-silica, glass, polymers (e.g. agarose, gelatin, hydrogels, polydimethylsiloxane (PDMS; elastomer), polymethylmethacrylate (PMMA), polycarbonate (PC), polypropylene (PP), polyethylene (PE), high density polyethylene (HDPE), polyimide, cyclic olefin polymers (COP), cyclic olefin copolymers (COC), polyethylene terephthalate (PET), epoxy resins, thiol-ene based resins), metals or metal films (e.g. aluminum, stainless steel, copper, nickel, chromium, and titanium), and the like. A hydrophilic material can be desirable for fabrication of the microwell arrays (e.g. to enhance wettability and minimize non-specific binding of cells and other biological material). Hydrophobic materials that can be treated or coated (e.g. by oxygen plasma treatment, or grafting of a polyethylene oxide surface layer) can also be used. The use of porous, hydrophilic materials for the fabrication of the microwell array can be desirable in order to facilitate capillary wicking/venting of entrapped air bubbles in the device. The microwell array can be fabricated from a single material. The microwell array can comprise two or more different materials that have been bonded together or mechanically joined.

Microwell arrays can be fabricated using substrates of any of a variety of sizes and shapes. For example, the shape (or footprint) of the substrate within which microwells are fabricated can be square, rectangular, circular, or irregular in shape. The footprint of the microwell array substrate can be similar to that of a microtiter plate. The footprint of the microwell array substrate can be similar to that of standard microscope slides, e.g. about 75 mm long×25 mm wide (about 3″ long×1″ wide), or about 75 mm long×50 mm wide (about 3″ long×2″ wide). The thickness of the substrate within which the microwells are fabricated can range from about 0.1 mm thick to about 10 mm thick, or more. The thickness of the microwell array substrate can be at least 0.1 mm thick, at least 0.5 mm thick, at least 1 mm thick, at least 2 mm thick, at least 3 mm thick, at least 4 mm thick, at least 5 mm thick, at least 6 mm thick, at least 7 mm thick, at least 8 mm thick, at least 9 mm thick, or at least 10 mm thick. The thickness of the microwell array substrate can be at most 10 mm thick, at most 9 mm thick, at most 8 mm thick, at most 7 mm thick, at most 6 mm thick, at most 5 mm thick, at most 4 mm thick, at most 3 mm thick, at most 2 mm thick, at most 1 mm thick, at most 0.5 mm thick, or at most 0.1 mm thick. The thickness of the microwell array substrate can be about 1 mm thick. The thickness of the microwell array substrate can be any value within these ranges, for example, the thickness of the microwell array substrate can be between about 0.2 mm and about 9.5 mm.

A variety of surface treatments and surface modification techniques can be used to alter the properties of microwell array surfaces. Examples can include, but are not limited to, oxygen plasma treatments to render hydrophobic material surfaces more hydrophilic, the use of wet or dry etching techniques to smooth (or roughen) glass and silicon surfaces, adsorption or grafting of polyethylene oxide or other polymer layers (such as pluronic), or bovine serum albumin to substrate surfaces to render them more hydrophilic and less prone to non-specific adsorption of biomolecules and cells, the use of silane reactions to graft chemically-reactive functional groups to otherwise inert silicon and glass surfaces, etc. Photodeprotection techniques can be used to selectively activate chemically-reactive functional groups at specific locations in the array structure, for example, the selective addition or activation of chemically-reactive functional groups such as primary amines or carboxyl groups on the inner walls of the microwells can be used to covalently couple oligonucleotide probes, peptides, proteins, or other biomolecules to the walls of the microwells. The choice of surface treatment or surface modification utilized can depend both or either on the type of surface property that is desired and on the type of material from which the microwell array is made.

The openings of microwells can be sealed, for example, during cell lysis steps to prevent cross hybridization of target nucleic acid between adjacent microwells. A microwell (or array of microwells) can be sealed or capped using, for example, a flexible membrane or sheet of solid material (i.e. a plate or platen) that clamps against the surface of the microwell array substrate, or a suitable bead, where the diameter of the bead is larger than the diameter of the microwell.

A seal formed using a flexible membrane or sheet of solid material can comprise, for example, inorganic nanopore membranes (e.g., aluminum oxides), dialysis membranes, glass slides, coverslips, elastomeric films (e.g. PDMS), or hydrophilic polymer films (e.g., a polymer film coated with a thin film of agarose that has been hydrated with lysis buffer).

Solid supports (e.g., beads) used for capping the microwells can comprise any of the solid supports (e.g., beads) of the disclosure. In some embodiments, the solid supports are cross-linked dextran beads (e.g., Sephadex). Cross-linked dextran beads can range in size from about 10 micrometers to about 80 micrometers. The cross-linked dextran beads used for capping can be from about 20 micrometers to about 50 micrometers. In some embodiments, the beads can be at least about 10, 20, 30, 40, 50, 60, 70, 80 or 90% larger than the diameter of the microwells. The beads used for capping can be at most about 10, 20, 30, 40, 50, 60, 70, 80 or 90% larger than the diameter of the microwells.

The seal or cap can allow buffer to pass into and out of the microwell, while preventing macromolecules (e.g., nucleic acids) from migrating out of the well. A macromolecule of at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more nucleotides can be blocked from migrating into or out of the microwell by the seal or cap. A macromolecule of at most about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more nucleotides can be blocked from migrating into or out of the microwell by the seal or cap.

Solid supports (e.g., beads) can be distributed among a substrate. Solid supports (e.g., beads) can be distributed among wells of the substrate, removed from the wells of the substrate, or otherwise transported through a device comprising one or more microwell arrays by means of centrifugation or other non-magnetic means. A microwell of a substrate can be pre-loaded with a solid support. A microwell of a substrate can hold at least 1, 2, 3, 4, or 5, or more solid supports. A microwell of a substrate can hold at most 1, 2, 3, 4, or 5 or more solid supports. In some embodiments, a microwell of a substrate can hold one solid support.

Individual cells and beads can be compartmentalized using alternatives to microwells, for example, a single solid support and single cell could be confined within a single droplet in an emulsion (e.g. in a droplet digital microfluidic system).

Cells could potentially be confined within porous beads that themselves comprise the plurality of tethered stochastic barcodes. Individual cells and solid supports can be compartmentalized in any type of container, microcontainer, reaction chamber, reaction vessel, or the like.

Single cell stochastic barcoding can be performed without the use of microwells. Single cell stochastic barcoding assays can be performed without the use of any physical container. For example, stochastic barcoding without a physical container can be performed by embedding cells and beads in close proximity to each other within a polymer layer or gel layer to create a diffusional barrier between different cell/bead pairs. In another example, stochastic barcoding without a physical container can be performed in situ, in vivo, on an intact solid tissue, on an intact cell, and/or subcellularly.

Microwell arrays can be a consumable component of the assay system. Microwell arrays can be reusable. Microwell arrays can be configured for use as a stand-alone device for performing assays manually, or they can be configured to comprise a fixed or removable component of an instrument system that provides for full or partial automation of the assay procedure. In some embodiments of the disclosed methods, the bead-based libraries of stochastic barcodes can be deposited in the wells of the microwell array as part of the assay procedure. In some embodiments, the beads can be pre-loaded into the wells of the microwell array and provided to the user as part of, for example, a kit for performing stochastic barcoding and digital counting of nucleic acid targets.

In some embodiments, two mated microwell arrays can be provided, one pre-loaded with beads which are held in place by a first magnet, and the other for use by the user in loading individual cells. Following distribution of cells into the second microwell array, the two arrays can be placed face-to-face and the first magnet removed while a second magnet is used to draw the beads from the first array down into the corresponding microwells of the second array, thereby ensuring that the beads rest above the cells in the second microwell array and thus minimizing diffusional loss of target molecules following cell lysis, while maximizing efficient attachment of target molecules to the stochastic barcodes on the bead.

In some embodiments, a substrate does not include microwells. For example, beads can be assembled (e.g., self-assembled). The beads can self-assemble into a monolayer. The monolayer can be on a flat surface of the substrate. The monolayer can be on a curved surface of the substrate. The bead monolayer can be formed by any method, such as alcohol evaporation.

Three-Dimensional Substrates

A three-dimensional array can be any shape. A three-dimensional substrate can be made of any material used in a substrate of the disclosure. In some embodiments, a three-dimensional substrate comprises a DNA origami. DNA origami structures incorporate DNA as a building material to make nanoscale shapes. The DNA origami process can involve the folding of one or more long, “scaffold” DNA strands into a particular shape using a plurality of rationally designed “staple” DNA strands. The sequences of the staple strands can be designed such that they hybridize to particular portions of the scaffold strands and, in doing so, force the scaffold strands into a particular shape. The DNA origami can include a scaffold strand and a plurality of rationally designed staple strands. The scaffold strand can have any sufficiently non-repetitive sequence.

The sequences of the staple strands can be selected such that the DNA origami has at least one shape to which stochastic labels can be attached. In some embodiments, the DNA origami can be of any shape that has at least one inner surface and at least one outer surface. An inner surface can be any surface area of the DNA origami that is sterically precluded from interacting with the surface of a sample, while an outer surface is any surface area of the DNA origami that is not sterically precluded from interacting with the surface of a sample. In some embodiments, the DNA origami has one or more openings (e.g., two openings), such that an inner surface of the DNA origami can be accessed by particles (e.g., solid supports). For example, in certain embodiments the DNA origami has one or more openings that allow particles smaller than 10 micrometers, 5 micrometers, 1 micrometer, 500 nm, 400 nm, 300 nm, 250 nm, 200 nm, 150 nm, 100 nm, 75 nm, 50 nm, 45 nm or 40 nm to contact an inner surface of the DNA origami.

The DNA origami can change shape (conformation) in response to one or more certain environmental stimuli. Thus an area of the DNA origami can be an inner surface when the DNA origami takes on some conformations, but can be an outer surface when the device takes on other conformations. In some embodiments, the DNA origami can respond to certain environmental stimuli by taking on a new conformation.

In some embodiments, the staple strands of the DNA origami can be selected such that the DNA origami is substantially barrel- or tube-shaped. The staples of the DNA origami can be selected such that the barrel shape is closed at both ends or is open at one or both ends, thereby permitting particles to enter the interior of the barrel and access its inner surface. In certain embodiments, the barrel shape of the DNA origami can be a hexagonal tube.

In some embodiments, the staple strands of the DNA origami can be selected such that the DNA origami has a first domain and a second domain, wherein the first end of the first domain is attached to the first end of the second domain by one or more single-stranded DNA hinges, and the second end of the first domain is attached to the second end of the second domain by one or more molecular latches. The plurality of staples can be selected such that the second end of the first domain becomes unattached to the second end of the second domain if all of the molecular latches are contacted by their respective external stimuli. Latches can be formed from two or more staple strands, including at least one staple strand having at least one stimulus-binding domain that is able to bind to an external stimulus, such as a nucleic acid, a lipid or a protein, and at least one other staple strand having at least one latch domain that binds to the stimulus-binding domain. The binding of the stimulus-binding domain to the latch domain supports the stability of a first conformation of the DNA origami.

Spatial labels can be delivered to a sample in three dimensions. For example, a sample can be associated with an array, wherein the array has spatial labels distributed or distributable in three dimensions. A three dimensional array can be a scaffolding, a porous substrate, a gel, a series of channels, or the like.

A three dimensional pattern of spatial labels can be associated with a sample by injecting the spatial labels into known locations within the sample, for example using a robot. A single needle can be used to serially inject spatial labels at different depths into a sample. An array of needles can inject spatial labels at different depths to generate a three dimensional distribution of labels.

In some embodiments, a three dimensional solid support can be a device. For example, a needle array device (e.g., a biopsy needle array device) can be a substrate. Stochastic barcodes of the disclosure can be attached to the device. Placing the device in and/or on a sample can bring the stochastic barcodes of the disclosure into proximity with targets in and/or on the sample. Different parts of the device can have stochastic barcodes with different spatial labels. For example, on a needle array device, each needle of the device can be coated with stochastic barcodes with different spatial labels on each needle. In this way, spatial labels can provide information about the location of the targets (e.g., location in orientation to the needle array).

Probes

The solid support/substrate of the disclosure can comprise a plurality of probes. The probes can be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more nucleotides in length. The probes can be at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more nucleotides in length.

The probes can be oligo(dT) probes. The probes can be any homopolymer sequence (e.g., poly(A), poly(C), poly(G), poly(U)).

The probes can be gene-specific. The probes can target any location of a gene (e.g., 3′ UTR, 5′ UTR, coding region, promoter). The probes on the substrate can be gene-specific for a plurality of genes. For example, a substrate can comprise probes that are gene-specific for at least 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100 or more genes. The plurality of gene-specific probes can be dispersed throughout the substrate evenly. The plurality of gene-specific probes can be dispersed throughout the substrate in discrete locations. There can be an equivalent number of gene-specific probes for each gene. There can be an inequivalent number of gene-specific probes for each gene. For example, one or more gene-specific probes can be represented on the substrate at least 10, 20, 30, 40, 50, 60, 70, or 80% or more compared to one or more other gene-specific probes. One or more gene-specific probes can be represented on the substrate at most 10, 20, 30, 40, 50, 60, 70, or 80% or more compared to one or more other gene-specific probes.

The substrate can comprise a plurality of gene-specific probes for a plurality of genes and a plurality of oligo(dT) probes. The combination of gene-specific probes and oligo(dT) probes can be useful for bridge amplification methods of the disclosure. The ratio of a gene-specific probe to an oligo(dT) probe can be at least 1:1, 1:2, 1:3, 1:4, or 1:5 or more. The ratio of a gene-specific probe to an oligo(dT) probe can be at most 1:1, 1:2, 1:3, 1:4, or 1:5 or more. The ratio of an oligo(dT) probe to a gene-specific probe can be at least 1:1, 1:2, 1:3, 1:4, or 1:5 or more. The ratio of an oligo(dT) probe to a gene-specific probe can be at most 1:1, 1:2, 1:3, 1:4, or 1:5 or more.

The probes on the replicate substrate can comprise any of the probes, or combination of probes, of the disclosure. The probes on the replicate substrate can be the same as the initial substrate. The probes on the replicate substrate can be different from the initial substrate. For example, the probes on the initial substrate can be gene-specific for a first location of a gene. The probes on the replicate slide can be gene-specific for a second location on the same gene. In this way, the probes can be used to identify (e.g., generate and/or detect) multiple amplicons from the same gene. The multiple amplicons can comprise different genetic features such as SNPs. Identification of multiple amplicons on the same gene can be useful for identification of SNPs and/or genetic mobility events (e.g., truncations, translocations, transpositions).

In some embodiments, the probes on the initial substrate can be oligo(dT) and the probes on the replicate substrate can be gene-specific or a combination of gene-specific and oligo(dT).

Synthesis of Stochastic Barcodes on Solid Supports and Substrates

A stochastic barcode can be synthesized on a solid support (e.g., bead). Pre-synthesized stochastic barcodes (e.g., comprising the 5′ amine that can link to the solid support) can be attached to solid supports (e.g., beads) through any of a variety of immobilization techniques involving functional group pairs on the solid support and the stochastic barcode. The stochastic barcode can comprise a functional group. The solid support (e.g., bead) can comprise a functional group. The stochastic barcode functional group and the solid support functional group can comprise, for example, biotin, streptavidin, primary amine(s), carboxyl(s), hydroxyl(s), aldehyde(s), ketone(s), and any combination thereof. A stochastic barcode can be tethered to a solid support, for example, by coupling (e.g. using 1-Ethyl-3-(3-dimethylaminopropyl) carbodiimide) a 5′ amino group on the stochastic barcode to the carboxyl group of the functionalized solid support. Residual non-coupled stochastic barcodes can be removed from the reaction mixture by performing multiple rinse steps. In some embodiments, the stochastic barcode and solid support are attached indirectly via linker molecules (e.g. short, functionalized hydrocarbon molecules or polyethylene oxide molecules) using similar attachment chemistries. The linkers can be cleavable linkers, e.g. acid-labile linkers or photo-cleavable linkers.

The stochastic barcodes can be synthesized on solid supports (e.g., beads) using any of a number of solid-phase oligonucleotide synthesis techniques, such as phosphodiester synthesis, phosphotriester synthesis, phosphite triester synthesis, and phosphoramidite synthesis. Single nucleotides can be coupled in step-wise fashion to the growing, tethered stochastic barcode. A short, pre-synthesized sequence (or block) of several oligonucleotides can be coupled to the growing, tethered stochastic barcode.

Stochastic barcodes can be synthesized by interspersing step-wise or block coupling reactions with one or more rounds of split-pool synthesis, in which the total pool of synthesis beads is divided into a number of individual smaller pools which are then each subjected to a different coupling reaction, followed by recombination and mixing of the individual pools to randomize the growing stochastic barcode sequence across the total pool of beads. Split-pool synthesis is an example of a combinatorial synthesis process in which a maximum number of chemical compounds are synthesized using a minimum number of chemical coupling steps. The potential diversity of the compound library thus created is determined by the number of unique building blocks (e.g. nucleotides) available for each coupling step, and the number of coupling steps used to create the library. For example, a split-pool synthesis comprising 10 rounds of coupling using 4 different nucleotides at each step will yield 4¹⁰=1,048,576 unique nucleotide sequences. In some embodiments, split-pool synthesis can be performed using enzymatic methods such as polymerase extension or ligation reactions rather than chemical coupling. For example, in each round of a split-pool polymerase extension reaction, the 3′ ends of the stochastic barcodes tethered to beads in a given pool can be hybridized with the 5′ ends of a set of semi-random primers, e.g. primers having a structure of 5′-(M)_(k)-(X)_(i)-(N)_(j)-3′, where (X)_(i) is a random sequence of nucleotides that is i nucleotides long (the set of primers comprising all possible combinations of (X)_(i)), (N)_(j) is a specific nucleotide (or series of j nucleotides), and (M)_(k) is a specific nucleotide (or series of k nucleotides), wherein a different deoxyribonucleotide triphosphate (dNTP) is added to each pool and incorporated into the tethered oligonucleotides by the polymerase.
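As a non-limiting illustration, the split, couple, and recombine cycle described above and the resulting potential diversity can be sketched in a few lines. The function names are hypothetical, one nucleotide is coupled per round, and each bead is assigned to a pool uniformly at random, which is a simplifying assumption.

```python
import random


def split_pool_diversity(n_building_blocks, n_rounds):
    """Potential number of unique sequences after n_rounds of coupling."""
    return n_building_blocks ** n_rounds


def simulate_split_pool(n_beads, n_rounds, alphabet="ACGT", seed=0):
    """Simulate split-pool synthesis; return one sequence per bead."""
    rng = random.Random(seed)
    beads = [""] * n_beads
    for _ in range(n_rounds):
        # Split: place each bead at random into one pool per building block.
        pools = {base: [] for base in alphabet}
        for sequence in beads:
            pools[rng.choice(alphabet)].append(sequence)
        # Couple: every bead in a pool is extended by that pool's nucleotide.
        # Recombine: the pools are merged and mixed before the next round.
        beads = [seq + base for base, members in pools.items() for seq in members]
        rng.shuffle(beads)
    return beads


print(split_pool_diversity(4, 10))  # 4 nucleotides, 10 rounds
beads = simulate_split_pool(n_beads=1_000, n_rounds=10)
print(len(set(beads)), "distinct sequences on", len(beads), "beads")
```

Because the number of simulated beads is far smaller than the potential diversity, most beads in this sketch carry a sequence not shared with any other bead.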

The number of stochastic barcodes conjugated to or synthesized on a solid support can comprise at least 100, 1000, 10000, or 1000000 or more stochastic barcodes. The number of stochastic barcodes conjugated to or synthesized on a solid support can comprise at most 100, 1000, 10000, or 1000000 or more stochastic barcodes. The number of oligonucleotides conjugated to or synthesized on a solid support such as a bead can be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10-fold more than the number of target nucleic acids in a cell. The number of oligonucleotides conjugated to or synthesized on a solid support such as a bead can be at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10-fold more than the number of target nucleic acids in a cell. At least 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100% of the stochastic barcodes can be bound by a target nucleic acid. At most 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100% of the stochastic barcodes can be bound by a target nucleic acid. At least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 or more different target nucleic acids can be captured by the stochastic barcodes on the solid support. At most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100 or more different target nucleic acids can be captured by the stochastic barcodes on the solid support.

In some embodiments, stochastic barcodes can be synthesized by randomly distributing a single-stranded DNA mixture onto a substrate pre-coated with primers. The single-stranded DNA can hybridize to the primers. Bridge amplification can be performed to convert the single-stranded DNAs into a cluster. Sequencing can be performed to determine the sequence of the DNA at each cluster on the substrate. A sample can be applied to the substrate, followed by the stochastic barcoding methods of the disclosure.

In some embodiments, barcodes can be synthesized using size and/or electrophoretic mobility. For example, a mixture of stochastic barcodes can be prepared and separated in two dimensions using gel electrophoresis. The gel can be the substrate.

Methods of Stochastic Barcoding

The disclosure provides for methods for estimating the number of distinct targets at distinct locations in a physical sample (e.g., tissue, organ, tumor, cell). The methods can comprise placing the stochastic barcodes in close proximity with the sample, lysing the sample, associating distinct targets with the stochastic barcodes, amplifying the targets and/or digitally counting the targets. The method can further comprise analyzing and/or visualizing the information obtained from the spatial labels on the stochastic barcodes. In some embodiments, the methods comprise visualizing the plurality of targets in the sample. Visualizing the plurality of targets in the sample can include mapping the plurality of targets onto a map of the sample. Mapping the plurality of targets onto the map of the sample can include generating a two dimensional map or a three dimensional map of the sample. The two dimensional map and the three dimensional map can be generated prior to or after stochastically barcoding the plurality of targets in the sample. In some embodiments, the two dimensional map and the three dimensional map can be generated before or after lysing the sample. Lysing the sample before or after generating the two dimensional map or the three dimensional map can include heating the sample, contacting the sample with a detergent, changing the pH of the sample, or any combination thereof.

FIG. 3 illustrates an exemplary embodiment of the stochastic barcoding method of the disclosure. A sample (e.g., section of a sample, thin slice, or cell) can be contacted with a solid support comprising a stochastic barcode. Targets in the sample can be associated with the stochastic barcodes. The solid supports can be collected. cDNA synthesis can be performed on the solid support. cDNA synthesis can be performed off the solid support. cDNA synthesis can incorporate the label information from the labels in the stochastic barcode into the new cDNA target molecule being synthesized, thereby generating a target-barcode molecule. The target-barcode molecules can be amplified using PCR. The sequence of the targets and the labels of the stochastic barcode on the target-barcode molecule can be determined by sequencing methods.
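As a non-limiting illustration of digital counting after sequencing, the number of each target at each location can be estimated by counting the distinct molecular labels observed for each combination of spatial label and target, so that amplification duplicates of the same target-barcode molecule are counted once. The sketch below assumes that each sequencing read has already been parsed into a (spatial label, molecular label, target) tuple and ignores sequencing errors; the function name, label strings, and target names are hypothetical.

```python
from collections import defaultdict


def count_targets(reads):
    """Count distinct molecular labels per (spatial label, target) pair.

    reads: iterable of (spatial_label, molecular_label, target) tuples.
    """
    labels_seen = defaultdict(set)
    for spatial_label, molecular_label, target in reads:
        # Reads sharing all three fields are treated as amplification copies.
        labels_seen[(spatial_label, target)].add(molecular_label)
    return {key: len(labels) for key, labels in labels_seen.items()}


# Hypothetical parsed reads; the second read is a duplicate of the first.
reads = [
    ("S1", "M1", "ACTB"),
    ("S1", "M1", "ACTB"),
    ("S1", "M2", "ACTB"),
    ("S2", "M1", "ACTB"),
    ("S2", "M3", "GAPDH"),
]
counts = count_targets(reads)
for (spatial_label, target), n in sorted(counts.items()):
    print(spatial_label, target, n)
```

The resulting counts, keyed by spatial label, are the kind of data that could then be mapped onto a two dimensional or three dimensional map of the sample.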

Contacting a Sample and a Stochastic Barcode

The disclosure provides for methods for contacting a sample (e.g., cells) to a substrate of the disclosure. A sample comprising, for example, a cell, organ, or tissue thin section, can be contacted to stochastic barcodes. The cells can be contacted, for example, by gravity flow wherein the cells can settle and create a monolayer. The sample can be a tissue thin section. The thin section can be placed on the substrate. The sample can be two-dimensional (e.g., form a planar surface). The sample (e.g., cells) can be spread across the substrate, for example, by growing/culturing the cells on the substrate.

When stochastic barcodes are in close proximity to targets, the targets can hybridize to the stochastic barcode. The stochastic barcodes can be contacted at a non-depletable ratio such that each distinct target can associate with a distinct stochastic barcode of the disclosure. To ensure efficient association between the target and the stochastic barcode, the targets can be crosslinked to the stochastic barcode.
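As a non-limiting illustration of why a large excess of barcodes favors each target associating with a distinct barcode, the following Python sketch computes the expected number of distinct barcodes used when targets pick barcodes independently and uniformly at random. The uniform, independent model is an assumption made for illustration; it is not a statement of how association behaves in any particular embodiment.

```python
def expected_distinct_barcodes(n_targets: int, n_barcodes: int) -> float:
    """Expected number of distinct barcodes hit by n_targets random picks.

    Assumes each target independently picks one of n_barcodes barcodes
    with equal probability (an idealized model).
    """
    return n_barcodes * (1.0 - (1.0 - 1.0 / n_barcodes) ** n_targets)


# With far more barcodes than targets, nearly every target is expected
# to receive its own barcode; with few barcodes, collisions are common.
print(expected_distinct_barcodes(100, 10**6))
print(expected_distinct_barcodes(100, 100))
```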

Cell Lysis

Following the distribution of cells and stochastic barcodes, the cells can be lysed to liberate the target molecules. Cell lysis can be accomplished by any of a variety of means, for example, by chemical or biochemical means, by osmotic shock, or by means of thermal lysis, mechanical lysis, or optical lysis. Cells can be lysed by addition of a cell lysis buffer comprising a detergent (e.g. SDS, Li dodecyl sulfate, Triton X-100, Tween-20, or NP-40), an organic solvent (e.g. methanol or acetone), or digestive enzymes (e.g. proteinase K, pepsin, or trypsin), or any combination thereof. To increase the association of a target and a stochastic barcode, the rate of the diffusion of the target molecules can be altered by, for example, reducing the temperature and/or increasing the viscosity of the lysate.

In some embodiments, the sample can be lysed using a filter paper. The filter paper can be soaked with a lysis buffer applied on top of the filter paper. The filter paper can be applied to the sample with pressure, which can facilitate lysis of the sample and hybridization of the targets of the sample to the substrate.

In some embodiments, lysis can be performed by mechanical lysis, heat lysis, optical lysis, and/or chemical lysis. Chemical lysis can include the use of digestive enzymes such as proteinase K, pepsin, and trypsin. Lysis can be performed by the addition of a lysis buffer to the substrate. A lysis buffer can comprise Tris HCl. A lysis buffer can comprise at least about 0.01, 0.05, 0.1, 0.5, or 1 M or more Tris HCl. A lysis buffer can comprise at most about 0.01, 0.05, 0.1, 0.5, or 1 M or more Tris HCl. A lysis buffer can comprise about 0.1 M Tris HCl. The pH of the lysis buffer can be at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more. The pH of the lysis buffer can be at most about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more. In some embodiments, the pH of the lysis buffer is about 7.5. The lysis buffer can comprise a salt (e.g., LiCl). The concentration of salt in the lysis buffer can be at least about 0.1, 0.5, or 1 M or more. The concentration of salt in the lysis buffer can be at most about 0.1, 0.5, or 1 M or more. In some embodiments, the concentration of salt in the lysis buffer is about 0.5 M. The lysis buffer can comprise a detergent (e.g., SDS, Li dodecyl sulfate, Triton X, Tween, NP-40). The concentration of the detergent in the lysis buffer can be at least about 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2, 3, 4, 5, 6, or 7% or more. The concentration of the detergent in the lysis buffer can be at most about 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 2, 3, 4, 5, 6, or 7% or more. In some embodiments, the concentration of the detergent in the lysis buffer is about 1% Li dodecyl sulfate. The time used in the method for lysis can be dependent on the amount of detergent used. In some embodiments, the more detergent used, the less time needed for lysis. The lysis buffer can comprise a chelating agent (e.g., EDTA, EGTA). The concentration of a chelating agent in the lysis buffer can be at least about 1, 5, 10, 15, 20, 25, or 30 mM or more. The concentration of a chelating agent in the lysis buffer can be at most about 1, 5, 10, 15, 20, 25, or 30 mM or more. In some embodiments, the concentration of chelating agent in the lysis buffer is about 10 mM. The lysis buffer can comprise a reducing reagent (e.g., beta-mercaptoethanol, DTT). The concentration of the reducing reagent in the lysis buffer can be at least about 1, 5, 10, 15, or 20 mM or more. The concentration of the reducing reagent in the lysis buffer can be at most about 1, 5, 10, 15, or 20 mM or more. In some embodiments, the concentration of reducing reagent in the lysis buffer is about 5 mM. In some embodiments, a lysis buffer can comprise about 0.1 M Tris HCl, about pH 7.5, about 0.5 M LiCl, about 1% lithium dodecyl sulfate, about 10 mM EDTA, and about 5 mM DTT. Lysis can be performed at a temperature of about 4, 10, 15, 20, 25, or 30 °C. Lysis can be performed for about 1, 5, 10, 15, or 20 or more minutes. A lysed cell can comprise at least about 100000, 200000, 300000, 400000, 500000, 600000, or 700000 or more target nucleic acid molecules. A lysed cell can comprise at most about 100000, 200000, 300000, 400000, 500000, 600000, or 700000 or more target nucleic acid molecules.
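As a non-limiting illustration, the volumes needed to prepare the exemplary lysis buffer (about 0.1 M Tris HCl at about pH 7.5, about 0.5 M LiCl, about 1% lithium dodecyl sulfate, about 10 mM EDTA, and about 5 mM DTT) can be sketched in Python using the dilution relation C1·V1 = C2·V2. The stock concentrations and the 10 mL batch size below are assumptions chosen for illustration; they are not taken from the disclosure.

```python
def stock_volumes(final_volume_ml, recipe, stocks):
    """Return the volume (mL) of each stock needed, plus water to volume.

    recipe and stocks must use the same unit for a given component
    (e.g. both in M, both in mM, or both in percent).
    """
    volumes = {}
    for name, final_conc in recipe.items():
        # C1 * V1 = C2 * V2  ->  V1 = C2 / C1 * V2
        volumes[name] = final_conc / stocks[name] * final_volume_ml
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


# Final concentrations from the exemplary buffer described above.
recipe = {
    "Tris HCl pH 7.5 (M)": 0.1,
    "LiCl (M)": 0.5,
    "lithium dodecyl sulfate (%)": 1.0,
    "EDTA (mM)": 10.0,
    "DTT (mM)": 5.0,
}
# Hypothetical stock concentrations, in the same units as the recipe.
stocks = {
    "Tris HCl pH 7.5 (M)": 1.0,
    "LiCl (M)": 8.0,
    "lithium dodecyl sulfate (%)": 10.0,
    "EDTA (mM)": 500.0,
    "DTT (mM)": 1000.0,
}
volumes = stock_volumes(10.0, recipe, stocks)
for name, ml in volumes.items():
    print(f"{name}: {ml:.3f} mL")
```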

Attachment of Stochastic Barcodes to Target Nucleic Acid Molecules

Following lysis of the cells and release of nucleic acid molecules therefrom, the nucleic acid molecules can randomly associate with the stochastic barcodes of the co-localized solid support. Association can comprise hybridization of a stochastic barcode's target recognition region to a complementary portion of the target nucleic acid molecule (e.g., oligo(dT) of the stochastic barcode can interact with a poly(A) tail of a target). The assay conditions used for hybridization (e.g. buffer pH, ionic strength, temperature, etc.) can be chosen to promote formation of specific, stable hybrids. In some embodiments, the nucleic acid molecules released from the lysed cells can associate with the plurality of probes on the substrate (e.g., hybridize with the probes on the substrate). When the probes comprise oligo(dT), mRNA molecules can hybridize to the probes and be reverse transcribed. The oligo(dT) portion of the oligonucleotide can act as a primer for first strand synthesis of the cDNA molecule.

Attachment can further comprise ligation of a stochastic barcode's target recognition region and a portion of the target nucleic acid molecule. For example, the target binding region can comprise a nucleic acid sequence that can be capable of specific hybridization to a restriction site overhang (e.g. an EcoRI sticky-end overhang). The assay procedure can further comprise treating the target nucleic acids with a restriction enzyme (e.g. EcoRI) to create a restriction site overhang. The stochastic barcode can then be ligated to any nucleic acid molecule comprising a sequence complementary to the restriction site overhang. A ligase (e.g., T4 DNA ligase) can be used to join the two fragments.
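As a non-limiting illustration, the creation of an EcoRI overhang can be sketched in Python on a single strand. EcoRI recognizes GAATTC and cuts between G and AATTC, leaving a 5′ AATT overhang. The function name and the example sequence are hypothetical, and the sketch models only the top strand rather than a full double-stranded digest.

```python
def digest_ecori(seq):
    """Cut a top-strand sequence at every EcoRI site (G^AATTC).

    Returns the top-strand fragments in 5'-to-3' order. Fragments
    downstream of a cut begin with AATTC, i.e. they carry the AATT
    overhang to which a complementary barcode end could hybridize.
    """
    site = "GAATTC"
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + 1  # cut falls after the G of the recognition site
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


# Illustrative sequence only.
fragments = digest_ecori("AAGAATTCTT")
print(fragments)
```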

The labeled targets from a plurality of cells (or a plurality of samples) (e.g., target-barcode molecules) can be subsequently pooled, for example by retrieving the stochastic barcodes and/or the beads to which the target-barcode molecules are attached. The retrieval of solid support-based collections of attached target-barcode molecules can be implemented by use of magnetic beads and an externally-applied magnetic field. Once the target-barcode molecules have been pooled, all further processing can proceed in a single reaction vessel. Further processing can include, for example, reverse transcription reactions, amplification reactions, cleavage reactions, dissociation reactions, and/or nucleic acid extension reactions. Further processing reactions can be performed within the microwells, that is, without first pooling the labeled target nucleic acid molecules from a plurality of cells.

Reverse Transcription

The disclosure provides for a method to create a stochastic target-barcode conjugate using reverse transcription. The stochastic target-barcode conjugate can comprise the stochastic barcode and a complementary sequence of all or a portion of the target nucleic acid (i.e. a stochastically barcoded cDNA molecule). Reverse transcription of the associated RNA molecule can occur by the addition of a reverse transcription primer along with the reverse transcriptase. The reverse transcription primer can be an oligo-dT primer, a random hexanucleotide primer, or a target-specific oligonucleotide primer. Oligo-dT primers can be, or can be about, 12-18 nucleotides in length and bind to the endogenous poly(A) tail at the 3′ end of mammalian mRNA. Random hexanucleotide primers can bind to mRNA at a variety of complementary sites. Target-specific oligonucleotide primers typically selectively prime the mRNA of interest.

In some embodiments, reverse transcription of the labeled-RNA molecule can occur by the addition of a reverse transcription primer. In some embodiments, the reverse transcription primer is an oligo(dT) primer, a random hexanucleotide primer, or a target-specific oligonucleotide primer. Generally, oligo(dT) primers are 12-18 nucleotides in length and bind to the endogenous poly(A) tail at the 3′ end of mammalian mRNA. Random hexanucleotide primers can bind to mRNA at a variety of complementary sites. Target-specific oligonucleotide primers typically selectively prime the mRNA of interest.

Reverse transcription can occur repeatedly to produce multiple labeled-cDNA molecules. The methods disclosed herein can comprise conducting at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 reverse transcription reactions. The method can comprise conducting at least about 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 reverse transcription reactions.

Amplification

One or more nucleic acid amplification reactions can be performed to create multiple copies of the labeled target nucleic acid molecules. Amplification can be performed in a multiplexed manner, wherein multiple target nucleic acid sequences are amplified simultaneously. The amplification reaction can be used to add sequencing adaptors to the nucleic acid molecules. The amplification reactions can comprise amplifying at least a portion of a sample label, if present. The amplification reactions can comprise amplifying at least a portion of the cellular and/or molecular label. The amplification reactions can comprise amplifying at least a portion of a sample tag, a cellular label, a spatial label, a molecular label, a target nucleic acid, or a combination thereof. The amplification reactions can comprise amplifying 0.5%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 100%, or a range or a number between any two of these values, of the plurality of nucleic acids. The method can further comprise conducting one or more cDNA synthesis reactions to produce one or more cDNA copies of target-barcode molecules comprising a sample label, a cellular label, a spatial label, and/or a molecular label.

In some embodiments, amplification can be performed using a polymerase chain reaction (PCR). As used herein, PCR can refer to a reaction for the in vitro amplification of specific DNA sequences by the simultaneous primer extension of complementary strands of DNA. As used herein, PCR can encompass derivative forms of the reaction, including but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplexed PCR, digital PCR, and assembly PCR.

Amplification of the labeled nucleic acids can comprise non-PCR based methods. Examples of non-PCR based methods include, but are not limited to, multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, rolling circle amplification, or circle-to-circle amplification. Other non-PCR-based amplification methods include multiple cycles of DNA-dependent RNA polymerase-driven RNA transcription amplification or RNA-directed DNA synthesis and transcription to amplify DNA or RNA targets, a ligase chain reaction (LCR), and a Qβ replicase (Qβ) method, use of palindromic probes, strand displacement amplification, oligonucleotide-driven amplification using a restriction endonuclease, an amplification method in which a primer is hybridized to a nucleic acid sequence and the resulting duplex is cleaved prior to the extension reaction and amplification, strand displacement amplification using a nucleic acid polymerase lacking 5′ exonuclease activity, rolling circle amplification, and ramification extension amplification (RAM). In some embodiments, the amplification does not produce circularized transcripts.

In some embodiments, the methods disclosed herein further comprise conducting a polymerase chain reaction on the labeled nucleic acid (e.g., labeled-RNA, labeled-DNA, labeled-cDNA) to produce a stochastically labeled-amplicon. The labeled-amplicon can be a double-stranded molecule. The double-stranded molecule can comprise a double-stranded RNA molecule, a double-stranded DNA molecule, or an RNA molecule hybridized to a DNA molecule. One or both of the strands of the double-stranded molecule can comprise a sample label, a spatial label, a cellular label, and/or a molecular label. The stochastically labeled-amplicon can be a single-stranded molecule. The single-stranded molecule can comprise DNA, RNA, or a combination thereof. The nucleic acids of the disclosure can comprise synthetic or altered nucleic acids.

Amplification can comprise use of one or more non-natural nucleotides. Non-natural nucleotides can comprise photolabile or triggerable nucleotides. Examples of non-natural nucleotides can include, but are not limited to, peptide nucleic acid (PNA), morpholino and locked nucleic acid (LNA), as well as glycol nucleic acid (GNA) and threose nucleic acid (TNA). Non-natural nucleotides can be added to one or more cycles of an amplification reaction. The addition of the non-natural nucleotides can be used to identify products as specific cycles or time points in the amplification reaction.

Conducting the one or more amplification reactions can comprise the use of one or more primers. The one or more primers can comprise, for example, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 or more nucleotides. The one or more primers can comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 or more nucleotides. The one or more primers can comprise less than 12-15 nucleotides. The one or more primers can anneal to at least a portion of the plurality of stochastically labeled targets. The one or more primers can anneal to the 3′ end or 5′ end of the plurality of stochastically labeled targets. The one or more primers can anneal to an internal region of the plurality of stochastically labeled targets. The internal region can be at least about 50, 100, 150, 200, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 650, 700, 750, 800, 850, 900 or 1000 nucleotides from the 3′ ends of the plurality of stochastically labeled targets. The one or more primers can comprise a fixed panel of primers. The one or more primers can comprise at least one or more custom primers. The one or more primers can comprise at least one or more control primers. The one or more primers can comprise at least one or more gene-specific primers.

The one or more primers can comprise a universal primer. The universal primer can anneal to a universal primer binding site. The one or more custom primers can anneal to a first sample label, a second sample label, a spatial label, a cellular label, a molecular label, a target, or any combination thereof. The one or more primers can comprise a universal primer and a custom primer. The custom primer can be designed to amplify one or more targets. The targets can comprise a subset of the total nucleic acids in one or more samples. The targets can comprise a subset of the total stochastically labeled targets in one or more samples. The one or more primers can comprise at least 96 or more custom primers. The one or more primers can comprise at least 960 or more custom primers. The one or more primers can comprise at least 9600 or more custom primers. The one or more custom primers can anneal to two or more different labeled nucleic acids. The two or more different labeled nucleic acids can correspond to one or more genes.

Any amplification scheme can be used in the methods of the present disclosure. For example, in one scheme, the first round of PCR can amplify molecules attached to the bead using a gene specific primer and a primer against the universal Illumina sequencing primer 1 sequence. The second round of PCR can amplify the first PCR products using a nested gene specific primer flanked by Illumina sequencing primer 2 sequence, and a primer against the universal Illumina sequencing primer 1 sequence. The third round of PCR can add P5 and P7 and a sample index to turn the PCR products into an Illumina sequencing library. Sequencing using 150 bp × 2 sequencing can reveal the cellular label and molecular index on read 1, the gene on read 2, and the sample index on the index 1 read.
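As a non-limiting illustration, splitting read 1 into a cellular label and a molecular index and combining it with the gene from read 2 and the sample index can be sketched in Python as follows. The label positions and lengths (8 nucleotides each, cellular label first), the function names, and the assumption that read 2 has already been assigned to a gene are all hypothetical; the disclosure does not specify this layout.

```python
def parse_read1(read1, cell_len=8, mol_len=8):
    """Split read 1 into (cellular label, molecular index).

    Assumes a hypothetical layout in which the cellular label occupies
    the first cell_len bases and the molecular index the next mol_len.
    """
    if len(read1) < cell_len + mol_len:
        raise ValueError("read 1 is shorter than the assumed label layout")
    cellular_label = read1[:cell_len]
    molecular_index = read1[cell_len:cell_len + mol_len]
    return cellular_label, molecular_index


def annotate_pair(read1, read2_gene, sample_index):
    """Combine the labels from read 1 with the gene and sample index."""
    cellular_label, molecular_index = parse_read1(read1)
    return {
        "sample_index": sample_index,
        "cellular_label": cellular_label,
        "molecular_index": molecular_index,
        "gene": read2_gene,
    }


# Illustrative values only.
record = annotate_pair("ACGTACGTTTGGCCAATTTTTTTT", "GAPDH", "ATCACG")
print(record)
```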

In some embodiments, nucleic acids can be removed from the substrate using chemical cleavage. For example, a chemical group or a modified base present in a nucleic acid can be used to facilitate its removal from a solid support. For example, an enzyme can be used to remove a nucleic acid from a substrate. For example, a nucleic acid can be removed from a substrate through a restriction endonuclease digestion. For example, treatment of a nucleic acid containing a dUTP or ddUTP with uracil-DNA glycosylase (UDG) can be used to remove a nucleic acid from a substrate. For example, a nucleic acid can be removed from a substrate using an enzyme that performs nucleotide excision, such as a base excision repair enzyme, such as an apurinic/apyrimidinic (AP) endonuclease. In some embodiments, a nucleic acid can be removed from a substrate using a photocleavable group and light. In some embodiments, a cleavable linker can be used to remove a nucleic acid from the substrate. For example, the cleavable linker can comprise at least one of biotin/avidin, biotin/streptavidin, biotin/neutravidin, Ig-protein A, a photo-labile linker, acid or base labile linker group, or an aptamer.

When the probes are gene-specific, the molecules can hybridize to the probes and be reverse transcribed and/or amplified. In some embodiments, after the nucleic acid has been synthesized (e.g., reverse transcribed), it can be amplified. Amplification can be performed in a multiplex manner, wherein multiple target nucleic acid sequences are amplified simultaneously. Amplification can add sequencing adaptors to the nucleic acid.

In some embodiments, amplification can be performed on the substrate, for example, with bridge amplification. cDNAs can be homopolymer tailed in order to generate a compatible end for bridge amplification using oligo(dT) probes on the substrate. In bridge amplification, the primer that is complementary to the 3′ end of the template nucleic acid can be the first primer of each pair that is covalently attached to the solid particle. When a sample containing the template nucleic acid is contacted with the particle and a single thermal cycle is performed, the template molecule can be annealed to the first primer and the first primer is elongated in the forward direction by addition of nucleotides to form a duplex molecule consisting of the template molecule and a newly formed DNA strand that is complementary to the template. In the heating step of the next cycle, the duplex molecule can be denatured, releasing the template molecule from the particle and leaving the complementary DNA strand attached to the particle through the first primer. In the annealing stage of the annealing and elongation step that follows, the complementary strand can hybridize to the second primer, which is complementary to a segment of the complementary strand at a location removed from the first primer. This hybridization can cause the complementary strand to form a bridge between the first and second primers secured to the first primer by a covalent bond and to the second primer by hybridization. In the elongation stage, the second primer can be elongated in the reverse direction by the addition of nucleotides in the same reaction mixture, thereby converting the bridge to a double-stranded bridge. The next cycle then begins, and the double-stranded bridge can be denatured to yield two single-stranded nucleic acid molecules, each having one end attached to the particle surface via the first and second primers, respectively, with the other end of each unattached. In the annealing and elongation step of this second cycle, each strand can hybridize to a further complementary primer, previously unused, on the same particle, to form new single-strand bridges. The two previously unused primers that are now hybridized elongate to convert the two new bridges to double-strand bridges.

The amplification reactions can comprise amplifying at least 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, or 100% of the plurality of nucleic acids.

Amplification of the labeled nucleic acids can comprise PCR-based methods or non-PCR based methods. Amplification of the labeled nucleic acids can comprise exponential amplification of the labeled nucleic acids. Amplification of the labeled nucleic acids can comprise linear amplification of the labeled nucleic acids. Amplification can be performed by polymerase chain reaction (PCR). PCR can refer to a reaction for the in vitro amplification of specific DNA sequences by the simultaneous primer extension of complementary strands of DNA. PCR can encompass derivative forms of the reaction, including but not limited to, RT-PCR, real-time PCR, nested PCR, quantitative PCR, multiplexed PCR, digital PCR, suppression PCR, semi-suppressive PCR and assembly PCR.
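As a non-limiting illustration of the difference between exponential and linear amplification, the following Python sketch compares idealized copy numbers after a given number of cycles. The per-cycle efficiency parameter and both formulas are textbook idealizations introduced here for illustration; real reactions plateau and vary by template.

```python
def exponential_copies(n0, cycles, efficiency=1.0):
    """Idealized exponential amplification: every copy is a template.

    With efficiency 1.0 the copy number doubles each cycle.
    """
    return n0 * (1.0 + efficiency) ** cycles


def linear_copies(n0, cycles, efficiency=1.0):
    """Idealized linear amplification: only the original templates
    are copied, so each cycle adds a fixed number of new copies."""
    return n0 * (1.0 + efficiency * cycles)


for cycles in (1, 10, 20):
    print(cycles, exponential_copies(1, cycles), linear_copies(1, cycles))
```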

In some embodiments, amplification of the labeled nucleic acids comprises non-PCR based methods. Examples of non-PCR based methods include, but are not limited to, multiple displacement amplification (MDA), transcription-mediated amplification (TMA), nucleic acid sequence-based amplification (NASBA), strand displacement amplification (SDA), real-time SDA, rolling circle amplification, or circle-to-circle amplification. Other non-PCR-based amplification methods include multiple cycles of DNA-dependent RNA polymerase-driven RNA transcription amplification or RNA-directed DNA synthesis and transcription to amplify DNA or RNA targets, a ligase chain reaction (LCR), a Qβ replicase (Qβ), use of palindromic probes, strand displacement amplification, oligonucleotide-driven amplification using a restriction endonuclease, an amplification method in which a primer is hybridized to a nucleic acid sequence and the resulting duplex is cleaved prior to the extension reaction and amplification, strand displacement amplification using a nucleic acid polymerase lacking 5′ exonuclease activity, rolling circle amplification, and/or ramification extension amplification (RAM).

In some embodiments, the methods disclosed herein further comprise conducting a nested polymerase chain reaction on the amplified amplicon (e.g., target). The amplicon can be a double-stranded molecule. The double-stranded molecule can comprise a double-stranded RNA molecule, a double-stranded DNA molecule, or an RNA molecule hybridized to a DNA molecule. One or both of the strands of the double-stranded molecule can comprise a sample tag or molecular identifier label. Alternatively, the amplicon can be a single-stranded molecule. The single-stranded molecule can comprise DNA, RNA, or a combination thereof. The nucleic acids of the present invention can comprise synthetic or altered nucleic acids.

In some embodiments, the method comprises repeatedly amplifying the labeled nucleic acid to produce multiple amplicons. The methods disclosed herein can comprise conducting at least about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 amplification reactions. Alternatively, the method comprises conducting at least about 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amplification reactions.

Amplification can further comprise adding one or more control nucleic acids to one or more samples comprising a plurality of nucleic acids. Amplification can further comprise adding one or more control nucleic acids to a plurality of nucleic acids. The control nucleic acids can comprise a control label.

Amplification can comprise use of one or more non-natural nucleotides. Non-natural nucleotides can comprise photolabile and/or triggerable nucleotides. Examples of non-natural nucleotides include, but are not limited to, peptide nucleic acid (PNA), morpholino and locked nucleic acid (LNA), as well as glycol nucleic acid (GNA) and threose nucleic acid (TNA). Non-natural nucleotides can be added to one or more cycles of an amplification reaction. The addition of the non-natural nucleotides can be used to identify products as specific cycles or time points in the amplification reaction.

Conducting the one or more amplification reactions can comprise the use of one or more primers. The one or more primers can comprise one or more oligonucleotides. The one or more oligonucleotides can comprise at least about 7-9 nucleotides. The one or more oligonucleotides can comprise less than 12-15 nucleotides. The one or more primers can anneal to at least a portion of the plurality of labeled nucleic acids. The one or more primers can anneal to the 3′ end and/or 5′ end of the plurality of labeled nucleic acids. The one or more primers can anneal to an internal region of the plurality of labeled nucleic acids. The internal region can be at least about 50, 100, 150, 200, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 650, 700, 750, 800, 850, 900 or 1000 nucleotides from the 3′ ends of the plurality of labeled nucleic acids. The one or more primers can comprise a fixed panel of primers. The one or more primers can comprise at least one or more custom primers. The one or more primers can comprise at least one or more control primers. The one or more primers can comprise at least one or more housekeeping gene primers. The one or more oligonucleotides can comprise a sequence selected from a group consisting of sequences in Table 23. The one or more primers can comprise a universal primer. The universal primer can anneal to a universal primer binding site. The one or more custom primers can anneal to the first sample tag, the second sample tag, the molecular identifier label, the nucleic acid or a product thereof. The one or more primers can comprise a universal primer and a custom primer. The custom primer can be designed to amplify one or more target nucleic acids. The target nucleic acids can comprise a subset of the total nucleic acids in one or more samples. In some embodiments, the primers are the probes attached to the array of the disclosure.

In some embodiments, stochastically barcoding the plurality of targets in the sample further comprises generating an indexed library of the stochastically barcoded targets. The molecular labels of different stochastic barcodes can be different from one another. Generating an indexed library of the stochastically barcoded targets includes generating a plurality of indexed polynucleotides from the plurality of targets in the sample. For example, for an indexed library of the stochastically barcoded targets comprising a first indexed target and a second indexed target, the label region of the first indexed polynucleotide can differ from the label region of the second indexed polynucleotide by at least one, two, three, four, or five nucleotides. In some embodiments, generating an indexed library of the stochastically barcoded targets includes contacting a plurality of targets, for example mRNA molecules, with a plurality of oligonucleotides including a poly(T) region and a label region; and conducting a first strand synthesis using a reverse transcriptase to produce single-strand labeled cDNA molecules each comprising a cDNA region and a label region, wherein the plurality of targets includes at least two mRNA molecules of different sequences and the plurality of oligonucleotides includes at least two oligonucleotides of different sequences. Generating an indexed library of the stochastically barcoded targets can further comprise amplifying the single-strand labeled cDNA molecules to produce double-strand labeled cDNA molecules; and conducting nested PCR on the double-strand labeled cDNA molecules to produce labeled amplicons. In some embodiments, the method can include generating an adaptor-labeled amplicon.
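As a non-limiting illustration, checking that the label regions in a library differ from one another by at least a given number of nucleotides can be sketched in Python using the Hamming distance. The function names and example label sequences are hypothetical, and the sketch assumes all label regions have equal length.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length labels differ."""
    if len(a) != len(b):
        raise ValueError("labels must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(labels):
    """Smallest Hamming distance between any two labels in the set."""
    return min(hamming(a, b) for a, b in combinations(labels, 2))


# Illustrative label regions only.
labels = ["ACGTAC", "ACGTTT", "TTGTAC"]
min_distance = min_pairwise_distance(labels)
print(min_distance)
```

A label set whose minimum pairwise distance is at least two tolerates a single substitution error without one label being read as another.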

Stochastic barcoding can use nucleic acid barcodes or tags to label individual nucleic acid (e.g., DNA or RNA) molecules. In some embodiments, it involves adding DNA barcodes or tags to cDNA molecules as they are generated from mRNA. Nested PCR can be performed to minimize PCR amplification bias. Adaptors can be added for sequencing using, for example, next generation sequencing (NGS).

FIG. 4 is a schematic illustration showing a non-limiting exemplary process of generating an indexed library of the stochastically barcoded targets, for example mRNAs. As shown in step 1, the reverse transcription process can encode each mRNA molecule with a unique molecular label, a spatial label, and a universal PCR site. In particular, RNA molecules 402 can be reverse transcribed to produce labeled cDNA molecules 404, including a cDNA region 406, by the stochastic hybridization of a set of molecular identifier labels 410 to the poly(A) tail region 408 of the RNA molecules 402. Each of the molecular identifier labels 410 can comprise a target-binding region, for example a poly(dT) region 412, a label region 414, and a universal PCR region 416.
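As a non-limiting illustration, assembling a molecular identifier label from a universal PCR region, a label region, and a poly(dT) region can be sketched in Python as follows. The 5′-to-3′ ordering of the parts, the order of the spatial and molecular labels within the label region, the example sequences, and the use of a random molecular label are assumptions made for illustration; they are not taken from FIG. 4.

```python
import random


def make_identifier_label(universal, spatial_label, mol_len, dt_len=18, rng=random):
    """Build a hypothetical oligonucleotide, written 5' to 3'.

    Layout assumed here: universal PCR region, then the label region
    (spatial label followed by a random molecular label), then poly(dT).
    """
    molecular_label = "".join(rng.choice("ACGT") for _ in range(mol_len))
    return universal + spatial_label + molecular_label + "T" * dt_len


# Illustrative sequences only; a seeded generator keeps the run repeatable.
rng = random.Random(0)
oligo = make_identifier_label("ACACGACGCTCTTCCGATCT", "AAAC", mol_len=8, rng=rng)
print(oligo)
```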

In some embodiments, the spatial label can include 3 to 20 nucleotides. In some embodiments, the molecular label can include 3 to 20 nucleotides. In some embodiments, each of the plurality of stochastic barcodes further comprises one or more of a universal label and a cellular label, wherein universal labels are the same for the plurality of stochastic barcodes on the solid support and cellular labels are the same for the plurality of stochastic barcodes on the solid support. In some embodiments, the universal label can include 3 to 20 nucleotides. In some embodiments, the cellular label comprises 3 to 20 nucleotides.
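As a non-limiting illustration of how label length bounds label diversity, the following Python sketch computes the number of distinct sequences available for a label of a given length, assuming the four standard nucleotides at every position. The function name is hypothetical.

```python
def label_diversity(length, alphabet_size=4):
    """Number of distinct sequences of the given length.

    Assumes every position can independently be any of alphabet_size
    nucleotides (4 for A, C, G, T).
    """
    return alphabet_size ** length


# Diversity at the two ends of the 3 to 20 nucleotide range.
print(label_diversity(3))
print(label_diversity(20))
```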

In some embodiments, the label region 414 can include a molecular label 418 and a spatial label 420. In some embodiments, the label region 414 can include one or more of a universal label, a dimension label, and a cellular label. The molecular label 418 can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or a number or a range between any of these values, of nucleotides in length. The spatial label 420 can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or a number or a range between any of these values, of nucleotides in length. The universal label can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or a number or a range between any of these values, of nucleotides in length. Universal labels can be the same for the plurality of stochastic barcodes on the solid support and cellular labels are the same for the plurality of stochastic barcodes on the solid support. The dimension label can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or a number or a range between any of these values, of nucleotides in length.

In some embodiments, the label region 414 can comprise 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number or a range between any of these values, different labels, such as a molecular label 418 and a spatial label 420. Each label can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or a number or a range between any of these values, of nucleotides in length. A set of molecular identifier labels 410 can contain 10, 20, 40, 50, 70, 80, 90, 10², 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, 10⁹, 10¹⁰, 10¹¹, 10¹², 10¹³, 10¹⁴, 10¹⁵, 10²⁰, or a number or a range between any of these values, molecular identifier labels 410. The molecular identifier labels 410 in the set can, for example, each contain a unique label region 414. The labeled cDNA molecules 404 can be purified to remove excess molecular identifier labels 410. Purification can comprise Ampure bead purification.
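The relationship between label length and the number of distinct labels available can be illustrated with a short sketch. It assumes only that each position of a label is one of the four DNA bases, so a label of n nucleotides has 4ⁿ possible sequences. The function names are hypothetical and are not part of the disclosure.

```python
from itertools import product

BASES = "ACGT"


def label_space_size(length: int) -> int:
    """Number of distinct sequences a label of `length` nucleotides can take."""
    return len(BASES) ** length


def enumerate_labels(length: int) -> list[str]:
    """List every possible label of the given length (practical only for short labels)."""
    return ["".join(p) for p in product(BASES, repeat=length)]


def min_label_length(set_size: int) -> int:
    """Shortest label length whose sequence space holds at least `set_size` labels."""
    length = 1
    while label_space_size(length) < set_size:
        length += 1
    return length
```

Under these assumptions, a set of 10⁶ unique label regions needs a label of at least 10 nucleotides, because 4⁹ is 262,144 and 4¹⁰ is 1,048,576.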

As shown in step 2, products from the reverse transcription process in step 1 can be pooled into 1 tube and PCR amplified with a 1st PCR primer pool and a 1st universal PCR primer. Pooling is possible because of the unique label region 414. In particular, the labeled cDNA molecules 404 can be amplified to produce nested PCR labeled amplicons 422. Amplification can comprise multiplex PCR amplification. Amplification can comprise a multiplex PCR amplification with 96 multiplex primers in a single reaction volume. In some embodiments, multiplex PCR amplification can utilize 10, 20, 40, 50, 70, 80, 90, 10², 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, 10⁹, 10¹⁰, 10¹¹, 10¹², 10¹³, 10¹⁴, 10¹⁵, 10²⁰, or a number or a range between any of these values, multiplex primers in a single reaction volume. Amplification can comprise a 1st PCR primer pool 424 of custom primers 426A-C targeting specific genes and a universal primer 428. The custom primers 426 can hybridize to a region within the cDNA portion 406′ of the labeled cDNA molecule 404. The universal primer 428 can hybridize to the universal PCR region 416 of the labeled cDNA molecule 404.

As shown in step 3 of FIG. 4, products from PCR amplification in step 2 can be amplified with a nested PCR primer pool and a 2nd universal PCR primer. Nested PCR can minimize PCR amplification bias. In particular, the nested PCR labeled amplicons 422 can be further amplified by nested PCR. The nested PCR can comprise multiplex PCR with a nested PCR primer pool 430 of nested PCR primers 432A-C and a 2nd universal PCR primer 428′ in a single reaction volume. The nested PCR primer pool 430 can contain, or contain about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, or a number or a range between any of these values, different nested PCR primers 432. The nested PCR primers 432 can contain an adaptor 434 and hybridize to a region within the cDNA portion 406″ of the labeled amplicon 422. The universal primer 428′ can contain an adaptor 436 and hybridize to the universal PCR region 416 of the labeled amplicon 422. Thus, step 3 produces adaptor-labeled amplicon 438. In some embodiments, nested PCR primers 432 and the 2nd universal PCR primer 428′ may not contain the adaptors 434 and 436. The adaptors 434 and 436 can instead be ligated to the products of nested PCR to produce adaptor-labeled amplicon 438.

As shown in step 4, PCR products from step 3 can be PCR amplified for sequencing using library amplification primers. In particular, the adaptors 434 and 436 can be used to conduct one or more additional assays on the adaptor-labeled amplicon 438. The adaptors 434 and 436 can be hybridized to primers 440 and 442. The one or more primers 440 and 442 can be PCR amplification primers. The one or more primers 440 and 442 can be sequencing primers. The one or more adaptors 434 and 436 can be used for further amplification of the adaptor-labeled amplicons 438. The one or more adaptors 434 and 436 can be used for sequencing the adaptor-labeled amplicon 438. The primer 442 can contain a plate index 444 so that amplicons generated using the same set of molecular identifier labels 410 can be sequenced in one sequencing reaction using NGS.
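Because the plate index 444 lets amplicons from several plates share one sequencing reaction, the reads have to be separated again after sequencing. The following is a minimal sketch of that separation, assuming each read has already been split into an index sequence and an insert sequence. The read layout, the example index sequences, and the function name are hypothetical illustrations, not details taken from the disclosure.

```python
def demultiplex(reads, plate_indexes):
    """Group insert sequences by plate, using an exact match on the plate index.

    reads: iterable of (index_sequence, insert_sequence) pairs.
    plate_indexes: dict mapping an index sequence to a plate name.
    Returns (by_plate, unassigned), where unassigned holds inserts whose
    index sequence matched no known plate.
    """
    by_plate = {name: [] for name in plate_indexes.values()}
    unassigned = []
    for index_seq, insert in reads:
        plate = plate_indexes.get(index_seq)
        if plate is None:
            unassigned.append(insert)
        else:
            by_plate[plate].append(insert)
    return by_plate, unassigned


# Hypothetical example: two plates pooled into one sequencing run.
plate_indexes = {"ACGTAC": "plate_1", "TGCATG": "plate_2"}
reads = [
    ("ACGTAC", "read_a"),
    ("TGCATG", "read_b"),
    ("ACGTAC", "read_c"),
    ("GGGGGG", "read_d"),  # index not assigned to any plate
]
by_plate, unassigned = demultiplex(reads, plate_indexes)
```

Exact matching is the simplest choice; a practical pipeline might tolerate a small number of index mismatches, which is not shown here.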

Sequencing

In some embodiments, estimating the number of the plurality of targets using the molecular label includes determining sequences of the spatial labels and molecular labels of the plurality of the stochastic barcodes and counting the number of the molecular labels with distinct sequences. Determining the sequences of the spatial labels and the molecular labels of the plurality of the stochastic barcodes can include sequencing some or all of the plurality of stochastic barcodes. Sequencing some or all of the plurality of stochastic barcodes can include generating sequences each with a read length of 100 or more bases.
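The counting step described above can be sketched as follows. The sketch assumes each sequencing read has already been resolved into a spatial label, a target identity, and a molecular label. The tuple layout, the example sequences, and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def count_targets(reads):
    """Estimate target counts per (spatial label, target) pair.

    reads: iterable of (spatial_label, target, molecular_label) tuples.
    Reads that share all three values are treated as amplification
    duplicates of one original molecule and are counted once.
    """
    distinct_labels = defaultdict(set)
    for spatial_label, target, molecular_label in reads:
        distinct_labels[(spatial_label, target)].add(molecular_label)
    return {key: len(labels) for key, labels in distinct_labels.items()}


# Hypothetical reads: the second read duplicates the first.
reads = [
    ("AAC", "GAPDH", "TTGCA"),
    ("AAC", "GAPDH", "TTGCA"),
    ("AAC", "GAPDH", "CCATG"),
    ("GTT", "GAPDH", "TTGCA"),
    ("GTT", "ACTB", "AGGTC"),
]
counts = count_targets(reads)
```

Here the location labeled AAC yields a count of 2 for GAPDH rather than 3, because two of its three reads carry the same molecular label.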

Determining the number of different stochastically labeled nucleic acids can comprise determining the sequence of the labeled target, the spatial label, the molecular label, the sample label, and the cellular label or any product thereof (e.g. labeled-amplicons, labeled-cDNA molecules). An amplified target can be subjected to sequencing. Determining the sequence of the stochastically labeled nucleic acid or any product thereof can comprise conducting a sequencing reaction to determine the sequence of at least a portion of a sample label, a spatial label, a cellular label, a molecular label, at least a portion of the stochastically labeled target, a complement thereof, a reverse complement thereof, or any combination thereof.

Determination of the sequence of a nucleic acid (e.g. amplified nucleic acid, labeled nucleic acid, cDNA copy of a labeled nucleic acid, etc.) can be performed using a variety of sequencing methods including, but not limited to, sequencing by hybridization (SBH), sequencing by ligation (SBL), quantitative incremental fluorescent nucleotide addition sequencing (QIFNAS), stepwise ligation and cleavage, fluorescence resonance energy transfer (FRET), molecular beacons, TaqMan reporter probe digestion, pyrosequencing, fluorescent in situ sequencing (FISSEQ), FISSEQ beads, wobble sequencing, multiplex sequencing, polymerized colony (POLONY) sequencing, nanogrid rolling circle sequencing (ROLONY), allele-specific oligo ligation assays (e.g., oligo ligation assay (OLA), single template molecule OLA using a ligated linear probe and a rolling circle amplification (RCA) readout, ligated padlock probes, or single template molecule OLA using a ligated circular padlock probe and a rolling circle amplification (RCA) readout), and the like.

In some embodiments, determining the sequence of the labeled nucleic acid or any product thereof comprises paired-end sequencing, nanopore sequencing, high-throughput sequencing, shotgun sequencing, dye-terminator sequencing, multiple-primer DNA sequencing, primer walking, Sanger dideoxy sequencing, Maxam-Gilbert sequencing, pyrosequencing, true single molecule sequencing, or any combination thereof. Alternatively, the sequence of the labeled nucleic acid or any product thereof can be determined by electron microscopy or a chemical-sensitive field effect transistor (chemFET) array.

High-throughput sequencing methods, such as cyclic array sequencing using platforms such as Roche 454, Illumina Solexa, ABI-SOLiD, Ion Torrent, Complete Genomics, Pacific Biosciences, Helicos, or the Polonator platform, can also be utilized. In some embodiments, sequencing can comprise MiSeq sequencing. In some embodiments, sequencing can comprise HiSeq sequencing.

The stochastically labeled targets can comprise nucleic acids representing from about 0.01% of the genes of an organism's genome to about 100% of the genes of an organism's genome. For example, about 0.01% of the genes of an organism's genome to about 100% of the genes of an organism's genome can be sequenced using a target complementary region comprising a plurality of multimers by capturing the genes containing a complementary sequence from the sample. In some embodiments, the labeled nucleic acids comprise nucleic acids representing from about 0.01% of the transcripts of an organism's transcriptome to about 100% of the transcripts of an organism's transcriptome. For example, about 0.01% of the transcripts of an organism's transcriptome to about 100% of the transcripts of an organism's transcriptome can be sequenced using a target complementary region comprising a poly-T tail by capturing the mRNAs from the sample.

Determining the sequences of the spatial labels and the molecular labels of the plurality of the stochastic barcodes can include sequencing 0.00001%, 0.0001%, 0.001%, 0.01%, 0.1%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 99%, 100%, or any number or range between two of these values, of the plurality of stochastic barcodes. Determining the sequences of the spatial labels and the molecular labels of the plurality of the stochastic barcodes can include sequencing 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, 10⁹, 10¹⁰, 10¹¹, 10¹², 10¹³, 10¹⁴, 10¹⁵, 10¹⁶, 10¹⁷, 10¹⁸, 10¹⁹, 10²⁰, or any number or range between two of these values, of the plurality of stochastic barcodes. Sequencing some or all of the plurality of stochastic barcodes can include generating sequences each with a read length of 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or any number or range between two of these values, of nucleotides or bases.

Sequencing can comprise sequencing at least about 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 or more nucleotides or base pairs of the labeled nucleic acid. Sequencing can comprise sequencing at least about 200, 300, 400, 500, 600, 700, 800, 900, 1,000 or more nucleotides or base pairs of the labeled nucleic acid. Sequencing can comprise sequencing at least about 1,500; 2,000; 3,000; 4,000; 5,000; 6,000; 7,000; 8,000; 9,000; or 10,000 or more nucleotides or base pairs of the labeled nucleic acid.

Sequencing can comprise at least about 200, 300, 400, 500, 600, 700, 800, 900, 1,000 or more sequencing reads per run. In some embodiments, sequencing comprises sequencing at least about 1,500; 2,000; 3,000; 4,000; 5,000; 6,000; 7,000; 8,000; 9,000; or 10,000 or more sequencing reads per run. Sequencing can comprise less than or equal to about 1,600,000,000 sequencing reads per run. Sequencing can comprise less than or equal to about 200,000,000 reads per run.

Samples

A sample for use in the method of the disclosure can comprise one or more cells. A sample can refer to one or more cells. In some embodiments, the plurality of cells can include one or more cell types. At least one of the one or more cell types can be brain cell, heart cell, cancer cell, circulating tumor cell, organ cell, epithelial cell, metastatic cell, benign cell, primary cell, circulatory cell, or any combination thereof. In some embodiments, the cells are cancer cells excised from a cancerous tissue, for example, breast cancer, lung cancer, colon cancer, prostate cancer, ovarian cancer, pancreatic cancer, brain cancer, melanoma and non-melanoma skin cancers, and the like. In some embodiments, the cells are derived from a cancer but collected from a bodily fluid (e.g. circulating tumor cells). Non-limiting examples of cancers can include adenoma, adenocarcinoma, squamous cell carcinoma, basal cell carcinoma, small cell carcinoma, large cell undifferentiated carcinoma, chondrosarcoma, and fibrosarcoma. The sample can include a tissue, a cell monolayer, fixed cells, a tissue section, or any combination thereof. The sample can include a biological sample, a clinical sample, an environmental sample, a biological fluid, a tissue, or a cell from a subject. The sample can be obtained from a human, a mammal, a dog, a rat, a mouse, a fish, a fly, a worm, a plant, a fungus, a bacterium, a virus, a vertebrate, or an invertebrate.

In some embodiments, the cells are cells that have been infected with virus and contain viral oligonucleotides. In some embodiments, the viral infection can be caused by a virus selected from the group consisting of double-stranded DNA viruses (e.g. adenoviruses, herpes viruses, pox viruses), single-stranded (+ strand or “sense”) DNA viruses (e.g. parvoviruses), double-stranded RNA viruses (e.g. reoviruses), single-stranded (+ strand or sense) RNA viruses (e.g. picornaviruses, togaviruses), single-stranded (− strand or antisense) RNA viruses (e.g. orthomyxoviruses, rhabdoviruses), single-stranded RNA-RT viruses, that is, (+ strand or sense) RNA viruses with a DNA intermediate in their life-cycle (e.g. retroviruses), and double-stranded DNA-RT viruses (e.g. hepadnaviruses). Exemplary viruses can include, but are not limited to, SARS, HIV, coronaviruses, Ebola, Dengue, Hepatitis C, Hepatitis B, and Influenza.

In some embodiments, the cells are bacteria. These can include either gram-positive or gram-negative bacteria. Examples of bacteria that can be analyzed using the disclosed methods, devices, and systems include, but are not limited to, Actinomadurae, Actinomyces israelii, Bacillus anthracis, Bacillus cereus, Clostridium botulinum, Clostridium difficile, Clostridium perfringens, Clostridium tetani, Corynebacterium, Enterococcus faecalis, Listeria monocytogenes, Nocardia, Propionibacterium acnes, Staphylococcus aureus, Staphylococcus epidermidis, Streptococcus mutans, Streptococcus pneumoniae, Streptococcus pyogenes and the like. Gram-negative bacteria include, but are not limited to, Afipia felis, Bacteroides, Bartonella bacilliformis, Bordetella pertussis, Borrelia burgdorferi, Borrelia recurrentis, Brucella, Calymmatobacterium granulomatis, Campylobacter, Escherichia coli, Francisella tularensis, Gardnerella vaginalis, Haemophilus aegyptius, Haemophilus ducreyi, Haemophilus influenzae, Helicobacter pylori, Legionella pneumophila, Leptospira interrogans, Neisseria meningitidis, Porphyromonas gingivalis, Providencia stuartii, Pseudomonas aeruginosa, Salmonella enteritidis, Salmonella typhi, Serratia marcescens, Shigella boydii, Streptobacillus moniliformis, Treponema pallidum, Vibrio cholerae, Yersinia enterocolitica, Yersinia pestis and the like. Other bacteria can include Mycobacterium avium, Mycobacterium leprae, Mycobacterium tuberculosis, Bartonella henselae, Chlamydia psittaci, Chlamydia trachomatis, Coxiella burnetii, Mycoplasma pneumoniae, Rickettsia akari, Rickettsia prowazekii, Rickettsia rickettsii, Rickettsia tsutsugamushi, Rickettsia typhi, Ureaplasma urealyticum, Diplococcus pneumoniae, Ehrlichia chaffeensis, Enterococcus faecium, Meningococci and the like.

In some embodiments, the cells are fungi. Non-limiting examples of fungi that can be analyzed using the disclosed methods, devices, and systems include Aspergilli, Candidae, Candida albicans, Coccidioides immitis, Cryptococci, and combinations thereof.

In some embodiments, the cells are protozoans or other parasites. Examples of parasites to be analyzed using the methods, devices, and systems of the present disclosure include, but are not limited to, Balantidium coli, Cryptosporidium parvum, Cyclospora cayetanensis, Encephalitozoa, Entamoeba histolytica, Enterocytozoon bieneusi, Giardia lamblia, Leishmaniae, Plasmodii, Toxoplasma gondii, Trypanosomae, trapezoidal amoeba, and worms (e.g., helminths), particularly parasitic worms including, but not limited to, Nematoda (roundworms, e.g., whipworms, hookworms, pinworms, ascarids, filarids and the like) and Cestoda (e.g., tapeworms).

As used herein, the term “cell” can refer to one or more cells. In some embodiments, the cells are normal cells, for example, human cells in different stages of development, or human cells from different organs or tissue types (e.g. white blood cells, red blood cells, platelets, epithelial cells, endothelial cells, neurons, glial cells, fibroblasts, skeletal muscle cells, smooth muscle cells, gametes, or cells from the heart, lungs, brain, liver, kidney, spleen, pancreas, thymus, bladder, stomach, colon, small intestine). In some embodiments, the cells can be undifferentiated human stem cells, or human stem cells that have been induced to differentiate. In some embodiments, the cells can be fetal human cells. The fetal human cells can be obtained from a mother pregnant with the fetus. In some embodiments, the cells are rare cells. A rare cell can be, for example, a circulating tumor cell (CTC), circulating epithelial cell, circulating endothelial cell, circulating endometrial cell, circulating stem cell, stem cell, undifferentiated stem cell, cancer stem cell, bone marrow cell, progenitor cell, foam cell, mesenchymal cell, trophoblast, immune system cell (host or graft), cellular fragment, cellular organelle (e.g. mitochondria or nuclei), pathogen infected cell, and the like.

In some embodiments, the cells are non-human cells, for example, other types of mammalian cells (e.g. mouse, rat, pig, dog, cow, or horse). In some embodiments, the cells are other types of animal or plant cells. In other embodiments, the cells can be any prokaryotic or eukaryotic cells.

In some embodiments, a first cell sample is obtained from a person not having a disease or condition, and a second cell sample is obtained from a person having the disease or condition. In some embodiments, the persons are different. In some embodiments, the persons are the same but cell samples are taken at different time points. In some embodiments, the persons are patients, and the cell samples are patient samples. The disease or condition can be a cancer, a bacterial infection, a viral infection, an inflammatory disease, a neurodegenerative disease, a fungal disease, a parasitic disease, a genetic disorder, or any combination thereof.

In some embodiments, cells suitable for use in the presently disclosed methods can range in size from about 2 micrometers to about 100 micrometers in diameter. In some embodiments, the cells can have diameters of at least 2 micrometers, at least 5 micrometers, at least 10 micrometers, at least 15 micrometers, at least 20 micrometers, at least 30 micrometers, at least 40 micrometers, at least 50 micrometers, at least 60 micrometers, at least 70 micrometers, at least 80 micrometers, at least 90 micrometers, or at least 100 micrometers. In some embodiments, the cells can have diameters of at most 100 micrometers, at most 90 micrometers, at most 80 micrometers, at most 70 micrometers, at most 60 micrometers, at most 50 micrometers, at most 40 micrometers, at most 30 micrometers, at most 20 micrometers, at most 15 micrometers, at most 10 micrometers, at most 5 micrometers, or at most 2 micrometers. The cells can have a diameter of any value within a range, for example from about 5 micrometers to about 85 micrometers. In some embodiments, the cells have diameters of about 10 micrometers.

In some embodiments, the cells are sorted prior to associating a cell with a bead. For example, the cells can be sorted by fluorescence-activated cell sorting or magnetic-activated cell sorting, or more generally by flow cytometry. The cells can be filtered by size. In some embodiments, a retentate contains the cells to be associated with the bead. In some embodiments, the flow through contains the cells to be associated with the bead.

A sample can refer to a plurality of cells. The sample can refer to a monolayer of cells. The sample can refer to a thin section (e.g., tissue thin section). The sample can refer to a solid or semi-solid collection of cells that can be placed in one dimension on an array.

Resolution of Spatial Labels

The methods of the disclosure relate to the relationship between the resolution of spatial labels and the size and/or spacing of the stochastic barcodes (e.g., cells). When samples are larger than the spacing of spatial labels, the resolution of the location of targets in the sample can be higher. When samples are smaller than the spacing of spatial labels, the resolution of the location of targets in the sample can be lower.

The stochastic barcodes can be spaced at a distance at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% of the longest dimension of the sample. The stochastic barcodes can be spaced at a distance at most 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% of the longest dimension of the sample. The stochastic barcodes can be spaced at a distance at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% of the shortest dimension of the sample. The stochastic barcodes can be spaced at a distance at most 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100% of the shortest dimension of the sample.

A sample can associate with one or more types of stochastic barcodes, wherein each type of stochastic barcode comprises a different spatial label. A sample can associate with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more types of stochastic barcodes (e.g., different spatial labels). A sample can associate with at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more types of stochastic barcodes (e.g., different spatial labels). The number of types of stochastic barcodes with which a sample can associate can be related to the spacing of the barcodes relative to the size of the sample.
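The dependence of the number of associated barcode types on barcode spacing can be illustrated with a one-dimensional simplification. The sketch assumes barcode positions lie on a regular grid at integer multiples of a fixed spacing and that each position carries a different spatial label. The function name, the grid model, and the use of micrometers as integer units are assumptions for illustration only.

```python
def barcodes_covered(sample_start: int, sample_length: int, spacing: int) -> int:
    """Count grid positions k * spacing that fall within the sample.

    The sample occupies the closed interval
    [sample_start, sample_start + sample_length]; all values are
    non-negative integers in the same unit (for example micrometers).
    """
    first = -(-sample_start // spacing)               # ceiling division
    last = (sample_start + sample_length) // spacing  # floor division
    return max(0, last - first + 1)


# A 100-unit sample with barcodes spaced at 25% of its length.
aligned = barcodes_covered(0, 100, 25)   # edges coincide with grid positions
shifted = barcodes_covered(10, 100, 25)  # sample offset from the grid
# A 20-unit sample lying between barcodes spaced 100 units apart.
missed = barcodes_covered(10, 20, 100)
```

With the spacing at 25% of the sample length the sample overlaps 4 or 5 barcode positions depending on its offset, whereas a sample smaller than the spacing can fall between positions entirely.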

In some embodiments, the methods of the disclosure relate to the relationship between the resolution of spatial labels and the spacing of the samples. When samples are spaced far apart (e.g., on a substrate), the spatial resolution of the targets in the sample can be higher because diffusion between samples may not contaminate the samples. When samples are spaced close together (e.g., on a substrate), the spatial resolution of the targets in the sample can be lower because diffusion of targets between the samples can contaminate a neighboring sample.

The samples can be spaced at least 1, 100, 200, 300, 400, 500, 600, 700, 800, 900 or more micrometers apart. The samples can be spaced at most 1, 100, 200, 300, 400, 500, 600, 700, 800, 900 or more micrometers apart. The samples can be spaced at least 1, 100, 200, 300, 400, 500, 600, 700, 800, 900 or more millimeters apart. The samples can be spaced at most 1, 100, 200, 300, 400, 500, 600, 700, 800, 900 or more millimeters apart. The samples can be spaced at least 1, 100, 200, 300, 400, 500, 600, 700, 800, 900 or more meters apart. The samples can be spaced at most 1, 100, 200, 300, 400, 500, 600, 700, 800, 900 or more meters apart.

Targets from a sample can diffuse at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 or more nanometers. Targets from a sample can diffuse at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 or more nanometers. Targets from a sample can diffuse at least 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or 1 or more millimeters. Targets from a sample can diffuse at most 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, or 1 or more millimeters.

Methods for Spatial Identification of a Nucleic Acid in a Sample

Disclosed herein are methods for determining spatial locations of a plurality of targets in a sample. In some embodiments, the methods include: imaging the sample to generate a sample image; stochastically barcoding the plurality of targets in the sample using a plurality of stochastic barcodes to generate stochastically barcoded targets, wherein each of the plurality of stochastic barcodes comprises a spatial label; and identifying the spatial location of each of the plurality of targets using the spatial label. Identifying the spatial location of each of the plurality of targets using the spatial label can include correlating the sample image with the spatial labels of the plurality of targets in the sample. Imaging the sample can include staining the sample with a stain, wherein the stain is a fluorescent stain, a negative stain, an antibody stain, or any combination thereof. Imaging the sample can include imaging the sample using optical microscopy, electron microscopy, confocal microscopy, fluorescence microscopy, or any combination thereof. Correlating the sample image with the spatial labels of the plurality of targets in the sample can include overlaying the sample image with the spatial labels of the plurality of targets in the sample. The sample can include a biological sample, a clinical sample, an environmental sample, a biological fluid, a tissue, or a cell from a subject. In some embodiments, the methods can include determining genotype, phenotype, or one or more genetic mutations of the subject based on the spatial labels of the plurality of targets in the sample. In some embodiments, the methods can include predicting susceptibility of the subject to one or more diseases. At least one of the one or more diseases can be cancer or a hereditary disease. The sample can include a plurality of cells and the plurality of targets can be associated with the plurality of cells. The plurality of cells can include one or more cell types. In some embodiments, the methods can include determining cell types of the plurality of cells in the sample. A drug can be chosen based on predicted responsiveness of the cell types of the plurality of cells in the sample.
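Correlating spatial labels with a sample image can be sketched as building a grid of target counts that shares the coordinate system of the image. The sketch assumes a lookup table from each spatial label to a (row, column) position on the substrate is available and that the image has been registered to the same grid. The table, the label sequences, and the function name are hypothetical.

```python
def spatial_count_grid(label_positions, counts, n_rows, n_cols):
    """Place per-label target counts onto a grid aligned with the sample image.

    label_positions: dict mapping spatial label -> (row, col) on the substrate.
    counts: dict mapping spatial label -> estimated number of target molecules.
    Labels without a known position are skipped rather than guessed.
    """
    grid = [[0] * n_cols for _ in range(n_rows)]
    for label, n_molecules in counts.items():
        position = label_positions.get(label)
        if position is None:
            continue
        row, col = position
        grid[row][col] += n_molecules
    return grid


# Hypothetical 2 x 3 substrate with three known spatial labels.
label_positions = {"AAC": (0, 0), "GTT": (0, 2), "CCA": (1, 1)}
counts = {"AAC": 2, "GTT": 1, "CCA": 7, "TTT": 4}  # "TTT" has no known position
grid = spatial_count_grid(label_positions, counts, n_rows=2, n_cols=3)
```

The resulting grid can be drawn over the sample image, for example as a heat map, so that regions with high target counts can be compared with the stained morphology.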

Imaging

The sample contacted to the substrate can be analyzed (e.g., with immunohistochemistry, staining and/or imaging). Exemplary methods of immunohistochemistry can comprise a step of reacting a labeled probe biological substance with a tissue section, the labeled probe biological substance being obtained by introducing a label into a substance capable of recognizing a biological substance to be detected, to visualize the biological substance to be detected present on the tissue section via a specific binding reaction between the biological substances.

For histology specimens, the tissue pieces can be fixed in a suitable fixative, typically formalin, and embedded in melted paraffin wax. The wax block can be cut on a microtome to yield a thin slice of paraffin containing the tissue. The specimen slice can be applied to a substrate, air dried, and heated to cause the specimen to adhere to the glass slide. Residual paraffin can be dissolved with a suitable solvent, typically xylene, toluene, or others. These so-called deparaffinizing solvents can be removed with a washing-dehydrating type reagent prior to staining. Slices can be prepared from frozen specimens, fixed briefly in 10% formalin, then infused with dehydrating reagent. The dehydrating reagent can be removed prior to staining with an aqueous stain.

In some embodiments, the Papanicolaou staining technique can be used (e.g., a progressive stain and/or hematoxylin-eosin [H&E], i.e., a regressive stain). HE (hematoxylin-eosin) stain uses hematoxylin and eosin as dyes. Hematoxylin is a blue-violet dye, and has a property of staining basophilic tissues such as cell nuclei, bone tissues, part of cartilage tissues, and serous components. Eosin is a red to pink dye, and has a property of staining eosinophilic tissues such as cytoplasm, connective tissues of the soft tissue, red blood cells, fibrin, and endocrine granules.

Immunohistochemistry (IHC) can be referred to as “immunological staining” due to the process of color development for visualizing an antigen-antibody reaction which is otherwise invisible (hereinafter, the term “immunohistochemical staining” can be used for immunohistochemistry). Lectin staining is a technique that detects a sugar chain in a tissue specimen by using the property of lectin of binding to a specific sugar chain in a non-immunological and specific manner.

HE staining, immunohistochemistry and lectin staining can be used for detecting a location of, for example, cancer cells in a cell specimen. For example, when it is desired to confirm a location of cancer cells in a cell specimen, a pathologist, in order to determine the presence or absence of cancer cells in the cell specimen, can prepare tissue sections and place them on a substrate of the disclosure. The section on the array can be subjected to HE staining, imaging, or any immunohistochemical analysis in order to obtain its morphological information and/or any other identifying features (such as presence or absence of rare cells). The sample can be lysed and the presence or absence of nucleic acid molecules can be determined using the methods of the disclosure. The nucleic acid information can be compared (e.g., spatially compared) to the image, thereby indicating the spatial location of nucleic acids in a sample.

In some embodiments, the tissue is stained with a staining enhancer (e.g., a chemical penetrant enhancer). Examples of tissue chemical penetrant enhancers that facilitate penetration of the stain into the tissue include, but are not limited to, polyethylene glycol (PEG); surfactants such as polyoxyethylenesorbitans and polyoxyethylene ethers (e.g., polyoxyethylenesorbitan monolaurate (Tween 20) and other Tween derivatives, polyoxyethylene 23 lauryl ether (Brij 35), Triton X-100, and Nonidet P-40); detergent-like substances such as lysolecithins and saponins; non-ionic detergents such as TRITON® X-100, etc.; aprotic solvents such as dimethyl sulfoxide (DMSO); ethers such as tetrahydrofuran, dioxane, etc.; esters such as ethyl acetate, butyl acetate, and isopropyl acetate; hydrocarbons such as toluene; chlorinated solvents such as dichloromethane, dichloroethane, chlorobenzene, etc.; ketones such as acetone; nitriles such as acetonitrile; and/or other agents that increase cell membrane permeability.

In some embodiments, a composition is provided that facilitates staining of a mammalian tissue sample. The composition can comprise a stain, such as hematoxylin, or hematoxylin and eosin-Y, at least one tissue chemical penetrant enhancer, such as a surfactant, an aprotic solvent, and/or PEG, or any combination thereof.

In some embodiments, the sample is imaged (e.g., either before or after IHC or without IHC). Imaging can comprise microscopy such as bright field imaging, oblique illumination, dark field imaging, dispersion staining, phase contrast, differential interference contrast, interference reflection microscopy, fluorescence, confocal, electron microscopy, transmission electron microscopy, scanning electron microscopy, and single plane illumination, or any combination thereof. Imaging can comprise the use of a negative stain (e.g., nigrosin, ammonium molybdate, uranyl acetate, uranyl formate, phosphotungstic acid, osmium tetroxide). Imaging can comprise the use of heavy metals (e.g., gold, osmium) that can scatter electrons.

Imaging can comprise imaging a portion of the sample (e.g., slide/array). Imaging can comprise imaging at least 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100% of the sample. Imaging can comprise imaging at most 10, 20, 30, 40, 50, 60, 70, 80, 90, or 100% of the sample. Imaging can be done in discrete steps (e.g., the image may not need to be contiguous). Imaging can comprise taking at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more different images. Imaging can comprise taking at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more different images.

Detection

The substrate surface can be contacted with one or more targets under conditions that promote specific, high-affinity binding (i.e., hybridization) of the target to one or more of the probes. The target nucleic acids can hybridize with complementary nucleic acids of the known oligonucleotide optical labels and thus, information about the target samples can be obtained. The targets can be labeled with an optically detectable label, such as a fluorescent tag or fluorophore, so that the targets are detectable with scanning equipment after a hybridization assay. The targets can be labeled either prior to, during, or even after the hybridization protocol, depending on the labeling system chosen, such that the fluorophore will associate only with probe-bound hybridized targets.

The targets (e.g., molecules, amplified molecules) can be detected, for example, using detection probes (e.g., fluorescent probes). The array can be hybridized with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more detection probes. The array can be hybridized with at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more detection probes. In some embodiments, the array is hybridized with 4 detection probes.

The detection probes can comprise a sequence complementary to a sequence of a gene of interest. The length of the detection probe can be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 or more nucleotides. The length of the detection probe can be at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 or more nucleotides. The detection probes can comprise a sequence that is perfectly complementary to a sequence in a gene of interest (e.g., target). The detection probes can comprise a sequence that is imperfectly complementary to a sequence in a gene of interest (e.g., target). The detection probes can comprise a sequence with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more mismatches to the sequence of the gene of interest. The detection probes can comprise a sequence with at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more mismatches to the sequence of the gene of interest.

The detection probes can comprise a detectable label. Exemplary detectable labels can comprise a fluorophore, chromophore, small molecule, nanoparticle, hapten, enzyme, antibody, and magnetic property, or any combination thereof.

Hybridized probes can be imaged. The image can be used to determine the relative expression level of the genes of interest based on the intensity of the detectable signal (e.g., fluorescent signal). Scanning laser fluorescence microscopes or readers can be used to acquire digital images of the emitted light from the substrate (e.g., microarray). A focused light source (usually a laser) can be scanned across the hybridized substrate, causing the hybridized areas to emit an optical signal, such as fluorescence. The fluorophore-specific fluorescence data can be collected and measured during the scanning operation, and then an image of the substrate can be reconstructed via appropriate algorithms, software and computer hardware. The expected or intended locations of probe nucleic acid features can then be combined with the fluorescence intensities measured at those locations, to yield the data that is then used to determine gene expression levels or nucleic acid sequence of the target samples. The process of collecting data from expected probe locations can be referred to as “feature extraction”. The digital images can be comprised of several thousand to hundreds of millions of pixels that typically range in size from 5 to 50 microns. Each pixel in the digital image can be represented by a 16 bit integer, allowing for 65,536 different grayscale values. The reader can sequentially acquire the pixels from the scanned substrate and write them into an image file which can be stored on a computer hard drive. The substrates can contain several different fluorescently tagged probe DNA samples at each spot location. The scanner can repeatedly scan the entire substrate with a laser of the appropriate wavelength to excite each of the probe DNA samples and store the resulting images in separate image files. The image files can be analyzed and subsequently viewed with the aid of a programmed computer.
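By way of a non-limiting illustration, the feature extraction step described above can be sketched in Python as follows. The function name `extract_features`, the square averaging window, the feature names, and the toy image are assumptions made for illustration only and are not part of the disclosure.

```python
from statistics import mean


def extract_features(image, features, radius=1):
    """Return the mean pixel intensity around each expected feature location.

    image    : list of rows, each a list of 16-bit intensities (0..65535)
    features : dict mapping a feature name to its expected (row, col) centre
    radius   : half-width of the square window averaged around each centre
               (hypothetical choice; real readers may use other spot models)
    """
    height, width = len(image), len(image[0])
    result = {}
    for name, (r, c) in features.items():
        # Collect the pixels in the window, clipped to the image bounds.
        window = [
            image[y][x]
            for y in range(max(0, r - radius), min(height, r + radius + 1))
            for x in range(max(0, c - radius), min(width, c + radius + 1))
        ]
        result[name] = mean(window)
    return result


# Toy 4x4 scan with a dim spot at the top left and a bright spot at the bottom right.
image = [
    [100, 100, 0, 0],
    [100, 100, 0, 0],
    [0, 0, 4000, 4000],
    [0, 0, 4000, 4000],
]
features = {"gene_A": (0, 0), "gene_B": (3, 3)}  # hypothetical probe locations
intensities = extract_features(image, features, radius=1)
```

In this sketch the ratio of the two extracted intensities would serve as a relative expression estimate for the two hypothetical features.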

The substrate can be imaged with a confocal laser scanner. The scanner can scan the substrate slide to produce one image for each dye used by sequentially scanning the slide with a laser of a proper wavelength for the particular dye. Each dye can have a known excitation spectrum and a known emission spectrum. The scanner can include a beam splitter which reflects a laser beam towards an objective lens which, in turn, focuses the beam at the surface of the slide to cause spherical fluorescence emission. A portion of the emission can travel back through the lens and the beam splitter. After traveling through the beam splitter, the fluorescence beam can be reflected by a mirror and travel through an emission filter, a focusing detector lens, and a central pinhole.

Correlation Between Probing and Imaging Data

The data from the substrate scan can be correlated to the image of the unlysed sample on the substrate. The data can be overlaid, thereby generating a map. A map of the location of targets from a sample can be constructed using information generated using the methods described herein. The map can be used to locate a physical location of a target. The map can be used to identify the location of multiple targets. The multiple targets can be the same species of target, or the multiple targets can be multiple different targets. For example, a map of a brain can be constructed to show the amount and location of multiple targets.

The map can be generated from data from a single sample. The map can be constructed using data from multiple samples, thereby generating a combined map. The map can be constructed with data from tens, hundreds, and/or thousands of samples. A map constructed from multiple samples can show a distribution of targets associated with regions common to the multiple samples. For example, replicated assays can be displayed on the same map. At least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more replicates can be displayed (e.g., overlaid) on the same map. At most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more replicates can be displayed (e.g., overlaid) on the same map. The spatial distribution and number of targets can be represented by a variety of statistics.

Combining data from multiple samples can increase the locational resolution of the combined map. The orientation of multiple samples can be registered by common landmarks and/or x-y positions on the array, wherein the individual locational measurements across samples are at least in part non-contiguous. Multiplexing the above approach will allow for high resolution maps of target nucleic acids in a sample.

The data analysis and correlation can be useful for determining the presence and/or absence of a specific cell type (e.g., rare cell, cancer cell). The data correlation can be useful for determining the relative ratios of target nucleic acids in distinct locations either within a cell, or within a sample.

The methods and compositions disclosed herein can be companion diagnostics for a medical professional (e.g., a pathologist) wherein a subject can be diagnosed by visually looking at a pathology image and correlating the image to genetic expression (e.g., identification of expression of oncogenes). The methods and compositions can be useful for identifying a cell from a population of cells, and determining the genetic heterogeneity of the cells within a sample. The methods and compositions can be useful for determining the genotype of a sample.

The disclosure provides for methods for making replicates of substrates. The substrates can be reprobed with different probes for different genes of interest, or to selectively choose specific genes. For example, a sample can be placed on a substrate comprising a plurality of oligo(dT) probes. mRNAs can hybridize to the probes. Replicate substrates comprising oligo(dT) probes can be contacted to the initial slide to make replicates of the mRNAs. Replicate substrates comprising RNA gene-specific probes can be contacted to the initial slide to make a replicate.

The mRNA can be reverse transcribed into cDNA. The cDNA can be homopolymer tailed and/or amplified (e.g., via bridge amplification). The array can be contacted with a replicate array. The replicate array can comprise gene-specific probes that can bind to the cDNAs of interest. The replicate array can comprise polyA probes that can bind to cDNAs with a polyadenylation sequence.

The number of replicates that can be made can be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more. The number of replicates that can be made can be at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more.

In some embodiments, the initial substrate comprises a plurality of gene-specific probes and the replicate substrate comprises the same gene-specific probes, or different probes that correspond to the same genes as the gene-specific probes.

Stochastic Barcoding with Physical Separation of Samples

In some embodiments, the sample can be physically divided or can be intact during stochastic barcoding of the plurality of targets in the sample. The spatial locations of the plurality of targets in the sample can be on a surface of the sample, inside the sample, subcellularly in the sample, or any combination thereof. In some embodiments, stochastically barcoding the plurality of targets in the sample can be performed on the surface of the sample, subcellularly in the sample, inside the sample, or any combination thereof.

A sample can be physically separated into different containers. Physical separation can be accomplished by dissection, for example by physically cutting a sample. Physical separation can be accomplished by sectioning, for example sectioning with a microtome. Physical separation can be accomplished by using a blade grid (e.g., a substrate wherein the edges of containers in the substrate are sharp such that they can cut a sample, and wherein the pieces of the cut sample can fall into the containers on the substrate). A blade grid can simultaneously separate and physically isolate the parts of the samples.

The process of physical separation can preserve information about the physically separated sample. Information preservation can occur by associating a known part of the sample with a particular spatial label and/or container. The containers can comprise spatial labels which can be used to represent the original physical relationships present before the sample was separated. The spatial labels can then be associated with targets within the parts of the physically separated samples. In this way, targets from an identifiable location within the sample can be stochastically labeled and digitally counted.

In a basic example, a sample, for example a solid tissue, can be bisected along a midsagittal plane. The right half of the organ can be placed in one container. The left half of the organ can be placed in a second container. A pool of non-depletable labels can be associated with targets in each container. The labels can be used to stochastically label targets within the sample. The labels can comprise a spatial label which can be used to identify which targets were in each container. The labeled targets from each container can be recombined for analysis. The analysis can include an amplification step. The amplified labeled-targets can be sequenced or hybridized to an array for analysis. The data generated from the analysis can include a stochastic count of the number of starting targets and, for each target, spatial information regarding whether the target was to the left or right of the midsagittal bisection.
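As a non-limiting sketch of the analysis in the basic example above, sequenced reads from the recombined containers can be grouped by spatial label and target, and the distinct molecular labels in each group can be counted. The function name `count_targets`, the label sequences, and the target names below are hypothetical and serve only to illustrate the counting step.

```python
from collections import defaultdict


def count_targets(reads):
    """Count distinct molecular labels per (spatial label, target).

    reads: iterable of (spatial_label, molecular_label, target) tuples, one per
    sequenced read. Amplification duplicates share all three fields and so
    collapse to a single counted molecule.
    """
    seen = defaultdict(set)
    for spatial, molecular, target in reads:
        seen[(spatial, target)].add(molecular)
    return {key: len(labels) for key, labels in seen.items()}


# Hypothetical reads: "LEFT"/"RIGHT" stand in for the two spatial labels.
reads = [
    ("LEFT", "ACGT", "GAPDH"),
    ("LEFT", "ACGT", "GAPDH"),   # amplification duplicate of the read above
    ("LEFT", "TTAG", "GAPDH"),
    ("RIGHT", "ACGT", "GAPDH"),  # same molecular label, different container
    ("RIGHT", "CCGA", "ACTB"),
]
counts = count_targets(reads)
```

Because the count is keyed on the spatial label, a molecular label that recurs in both containers is counted once on each side of the bisection rather than being merged.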

A sample can be physically separated into more than two sections. A sample can be divided into at least 2, 3, 4, 5, 6, 7, 8, 9, 10 or more sections. A sample can be divided into at most 2, 3, 4, 5, 6, 7, 8, 9, 10 or more sections. A sample can be divided into hundreds of sections. A sample can be divided into at least 100, 200, 300, 400, 500, 600, 700, 800, or 900 or more sections. A sample can be divided into at most 100, 200, 300, 400, 500, 600, 700, 800, or 900 or more sections. A sample can be divided into thousands of sections. A sample can be divided into at least 1000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000 or 90000 or more sections. A sample can be divided into at most 1000, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 80000 or 90000 or more sections. A sample can be divided into 16, 32, 48, 96, or 384 sections. The higher the number of sections the sample is divided into, the greater the spatial resolution imparted by the spatial labels.

The sections (e.g., of the solid tissue and/or comprising the targets) can be arranged such that a physical relationship of sections is similar to the physical relationship between containers on a substrate (e.g., a grid). For example, a target that is in a “top right” section of a sample can be located in a “top right” container. This can allow for a sample to be directly applied to a substrate to preserve the physical relationship between the sections of the sample (e.g., solid tissue). FIG. 5 illustrates how a solid tissue 505 can be divided into sections 510 (e.g., sections 1-10). The sections can be placed on a substrate 515 (e.g., grid). The sections 510 of the solid tissue 505 can be placed onto a specific location within the substrate 515, wherein the specific location is a replicated map of the solid tissue 505. The placement of sections of a solid tissue onto the substrate can be performed in two or three dimensions. FIG. 5 illustrates an exemplary embodiment of a two-dimensional substrate. In some embodiments, the substrate can be three-dimensional. Tens, hundreds, thousands, or millions of sections from a sample can be reflected by physical locations of containers on the substrate.

If the location of the first section in the sample is known, then that information can be associated with targets within the container containing the section. For example, as shown in FIG. 5, after the sections 510 of the solid tissue 505 have been placed in containers on the substrate 515, the targets of the sections can be stochastically labeled, amplified, and/or counted. The information (e.g., number of types of molecules) arising from container aa on the grid 515 can correspond 520 to physical section (e.g., location) 1 of the solid tissue 505. Similarly, the information arising from container ab of the substrate 515 can correspond 520 to physical section 4 of the solid tissue 505.

In some embodiments, the methods of the disclosure can be used for identifying the surface of a sample. For example, spatial labels can be added to the surface of a sample (e.g., solid tissue). The sample can be lysed, stochastically labeled with the spatial labels, amplified, and/or digitally counted. Targets which were on the surface of the sample can be distinguished from targets on the interior of the sample, based on the spatial label. Identifying the surface of a sample, and/or distinguishing between the interior and exterior of a sample, can be useful for determining boundaries of a solid tissue (e.g., tumor), determining if resection of a solid tissue was performed completely, and/or identifying boundaries of different physiological structures or cell types.

Spatial labels can be associated with a spatially intact sample. For example, a needle, or array of needles, can insert spatial labels into an intact sample. Spatial labels can be inserted into an intact sample in a variety of ways, including but not limited to, needle insertion, pin insertion, insertion through blood capillaries, injection, electroporation, transduction, and transformation. The intact sample can then be lysed for stochastic barcoding, amplification, and digital counting.

Stochastic Barcoding with Physical Separation of Samples Combined with Time Separation

In some embodiments, stochastically barcoding the plurality of targets in the sample can include contacting the sample with a device. The device can be a needle, a needle array, a tube, a suction device, an injection device, an electroporation device, a fluorescent activated cell sorter device, a microfluidic device, or any combination thereof. The device can contact sections of the sample at a specified rate. The specified rate can correlate the spatial locations of the plurality of targets with one or more time points.

Containers on a substrate can be filled in an order that reflects a physical location. Spatial labels can be combined with a section as a part of the sample collection. For example, a sampling device comprising a suction device could be used to remove a sample in a predefined pattern. As the section of the sample travels through the suction device, spatial labels can be associated with the section. The serial addition of the spatial labels can identify where the suction device was in space at the time of sectioning.

For example, as shown in FIG. 6, a solid tissue 605 can be divided into sections 610 with a sampling device. The sections 610 can be transported to a substrate 615 based on the order in which they were obtained from the sample. For example, section 1 can be placed in a container in the substrate 615 which corresponds to time 1 (T1). The container which corresponds to T1 can comprise a time label (see below). The targets in the sections can be stochastically labeled, amplified, and/or counted. The information (e.g., number of types of molecules) arising from container T1 on the substrate 615 can correspond 620 to physical section (e.g., location) 1 of the solid tissue 605. Similarly, the information arising from container T4 of the substrate 615 can correspond 620 to physical section 4 of the solid tissue 605. The time label can indicate the physical location of the section in the sample by reference to the rate at which the sampling device transfers each section from the sample to the substrate. Obtainment of the sections by the sampling device can be performed serially. The time labels can be added to sections and/or containers in serial before sections are added, simultaneously with addition of the section to the container, or after the section is added to the container.

For example, a needle array device can inject solid supports comprising stochastic barcodes with spatial labels into a solid tissue. As the needle retracts, the solid support is left in the tissue. As the needle retracts further up a column of a solid tissue, a series of solid supports can be placed in the tissue (e.g., along a column). The rate of movement of the device can be correlated to a time that the device was in a specific position. In this way, the time a spatial label was associated with a target can be indicative of its position in the sample.
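The relationship between time and position described above can be illustrated, without limitation, by the following Python sketch. It assumes a device that moves or samples at a constant rate; the function names, units, and numeric values are hypothetical and are not taken from the disclosure.

```python
def section_from_time(timestamp_s, seconds_per_section, first_section=1):
    """Map the time a section was collected to its serial section number.

    Assumes (hypothetically) that the sampling device takes exactly
    seconds_per_section to process each section, starting at time zero.
    """
    if seconds_per_section <= 0:
        raise ValueError("seconds_per_section must be positive")
    return first_section + int(timestamp_s // seconds_per_section)


def depth_from_time(timestamp_s, retraction_um_per_s, start_depth_um):
    """Depth of a retracting needle tip at a given time, in micrometres.

    Assumes (hypothetically) a constant retraction rate from start_depth_um.
    """
    return start_depth_um - retraction_um_per_s * timestamp_s


# A section collected 6.5 s into a run at 2 s per section is the fourth section.
fourth = section_from_time(6.5, 2.0)

# A needle starting 500 um deep and retracting at 50 um/s is 350 um deep after 3 s.
depth = depth_from_time(3.0, 50.0, 500.0)
```

Under these assumptions a time label is interchangeable with a position, so a container indexed by time (e.g., T4) can be read back as a section or depth within the sample.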

In another example, the sampling device can comprise a microfluidic chip. The sampling device can be capable of taking a section of a sample, and placing it in a microfluidic chip. Inside the microfluidic chip, the section can be encapsulated in an emulsion (e.g., droplet). The emulsion can comprise a stochastic barcode with a spatial label. The emulsion can be placed in a container of a substrate. The location in the substrate in which the emulsion is placed can be indicative of the physical location of the section in the sample because of the information carried in the time label.

Non-Physical Representation of Containers

The disclosure provides for a method for estimating the number of molecules in a specific location of a sample. The method can comprise dividing the sample into sections and stochastically labeling the sections with a barcode such that they contain information about the physical location of the sections. The stochastic barcoding does not have to occur in containers that have a similar physical relationship to the sample, as described in FIGS. 5 and 6. The containers need not have a similar physical relationship to the sample. The method does not need to make use of containers. For example, as shown in FIG. 7, the sample 705 can be divided into sections 710. The sections 710 can be placed into one or more randomly located containers on a substrate 715, wherein the location of the section has no physical relationship to the physical structure, shape and/or morphology of the sample 705. The placement of sections of a solid tissue onto the substrate can be performed in two or three dimensions. FIG. 7 illustrates an exemplary embodiment of a two-dimensional substrate. In some embodiments, the substrate can be three-dimensional. Tens, hundreds, thousands, or millions of sections from a sample can be reflected by physical locations of containers on the substrate.

If the location of the first section in the sample is known, then that information can be associated with targets within the container containing the section. For example, as shown in FIG. 7, after the sections 710 of the solid tissue 705 have been placed in containers on the substrate 715, the targets of the sections can be stochastically labeled, amplified, and/or counted. The information (e.g., number of types of molecules) arising from container aa on the grid 715 can correspond 720 to a physical section (e.g., location) of the solid tissue 705 (container aa corresponds to section 5, though it is located where section 1 of the sample is).
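When the containers bear no physical relationship to the sample, the correspondence between container and section can be held in a recorded lookup table. The following Python sketch is a non-limiting illustration; only the pairing of container aa with section 5 comes from the example above, and the remaining pairings, the counts, and the function name `counts_by_section` are hypothetical.

```python
def counts_by_section(counts_by_container, container_to_section):
    """Re-key per-container target counts by the sample section they came from."""
    by_section = {}
    for container, count in counts_by_container.items():
        # A KeyError here means the placement of that container was never recorded.
        section = container_to_section[container]
        by_section[section] = by_section.get(section, 0) + count
    return by_section


# Recorded placements: container aa holds section 5; the others are illustrative.
container_to_section = {"aa": 5, "ab": 1, "ba": 3}

# Hypothetical digital counts obtained from each container after barcoding.
counts_by_container = {"aa": 120, "ab": 45, "ba": 80}

section_counts = counts_by_section(counts_by_container, container_to_section)
```

The table, rather than the geometry of the substrate, carries the spatial information in this arrangement.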

Addition of Stochastic Barcodes to Samples

Samples and/or sections of samples can be added to containers on a substrate in parallel. A sampling device can obtain spatially known samples in parallel and then be used to associate the samples with a spatial label. For example, an array of biopsies can be obtained. The biopsies can be associated with labels on the device which obtains the biopsies. The biopsies can be associated with labels after the biopsies are put into containers. In some embodiments, a needle array is used to obtain samples.

The spatial labels can be combined with a sample as a part of the sample collection. For example, a suction device could be used to remove a sample in a predefined pattern. As the sample travels through the suction device, spatial labels can be associated with the sample. The serial addition of the label can identify where the suction device was in space at the time of collection.

For example, a solid tumor can be resected. The resected tumor can have its exterior labeled with spatial labels, for example by spraying the sample, immersing the sample, or contacting the sample with a composition comprising a spatial label.

Spatial Barcoding of Specific Cells

In some embodiments, spatial labels are delivered to a specific target location. The target location can refer to a location in the body, a specific type of cell, and/or a subcellular compartment. A spatial label can be associated with a molecule known to target a specific organ in the body. For example, a spatial label can be associated with a molecule that is processed in the liver; a spatial label can be associated with a molecule that can cross the blood brain barrier; or a spatial label can be associated with a molecule that can be taken up by blood capillaries. The molecule with which a spatial label is associated can bring the spatial label into close proximity to a location of interest in the body. The location of interest in the body can be isolated, stochastically labeled with the spatial label, amplified, and/or digitally counted to obtain information about the number of targets in the location of interest.

A spatial label can be associated with a molecule known to target a specific cell. For example, a spatial label can be associated with a molecule that targets an immune cell (e.g., a targeting molecule). A spatial label can be associated with a molecule that targets a virus. A spatial label can be associated with a molecule that targets the blood brain barrier. The molecule can be a targeting molecule that can bring the spatial label to a location in a sample (e.g., subject).

A spatial label can be associated with a molecule known to target a specific subcellular compartment. For example, a spatial label can be associated with a vesicle which can comprise a location tag, such as for the endoplasmic reticulum. The vesicle can deliver the spatial label within close proximity of the endoplasmic reticulum. The endoplasmic reticulum can be isolated, stochastically labeled with the spatial label, amplified, and/or digitally counted.

Exemplary subcellular compartments can include, but are not limited to, mitochondria, Golgi complex, cell wall, endoplasmic reticulum, nucleus, nucleolus, lysosomes, protein complexes (e.g., APC, lincRNAs), and the like. Exemplary targeting molecules can include, but are not limited to, nuclear localization sequences, nuclear export sequences, chloroplast localization signals, mitochondrial localization signals, and the like. In some embodiments, the targeting molecule can comprise a vesicle. Exemplary vesicles can include liposomes, microsomes, nanodots, quantum dots, nanoparticles, and viral capsids, or any combination thereof.

Method of Label Lithography

The methods of the disclosure can provide for building a spatial label after the label has been constricted within and/or contacted to a sample. In some embodiments, the methods include: stochastically barcoding the plurality of targets in the sample using a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a pre-spatial label; concatenating one or more spatial label blocks onto the pre-spatial label to generate a spatial label; and identifying the spatial location of each of the plurality of targets using the spatial label.

FIG. 8 shows an exemplary embodiment of the label lithography method of the disclosure. A target 804 can associate with a pre-spatial label 805. A pre-spatial label 805 can comprise a nucleotide sequence that can hybridize with targets of interest (e.g., gene specific nucleotide sequence or oligo(dT)) 810. The pre-spatial label 805 can comprise an activatable consensus sequence 815. The activatable consensus sequence 815 can be a nucleotide sequence that can be linked to another nucleotide sequence or base. For example, an activatable sequence 815 can be a restriction site, a site for TA-ligation, and/or a photo-activatable nucleotide. The activatable consensus sequence 815 can be linked to a spatial label block 820/821. A spatial label block 820/821 can comprise a nucleotide sequence that is indicative of a spatial location 825. A spatial label block can comprise linking sequences 830. Linking sequences 830 can interact with the activatable consensus sequence 815 and/or other linking sequences in spatial label blocks 820. For example, a first group (Group I) of spatial label blocks 820/821 can comprise a first (A′) and second (B) linking sequence. The first linking sequence (A′) can interact with the activatable consensus sequence 815 in the pre-spatial label 805. A second group (Group II) of spatial label blocks 821 can comprise a first (B′) and second (A) linking sequence. The first linking sequence (B′) can interact with the second linking sequence (B) of the first group of spatial label blocks 821. In this way, spatial label blocks 820/821 can be linked together.

Pre-Spatial Labels, Spatial Label Blocks, and Spatial Labels

A pre-spatial label can comprise a sequence that can associate with a target of interest. A pre-spatial label can associate with nucleic acid, including but not limited to, DNA, mRNA, RNA fragments, gene-specific regions, and regulatory elements (e.g., promoter, enhancer). A sequence that can bind to a target of interest can comprise a gene-specific region (e.g., a nucleotide sequence that is adapted to bind to a specific region of a gene), or a non-specific binding region (e.g., oligo(dT), random hexamer, random oligomer). A sequence that can associate with a target of interest can be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more nucleotides in length. A sequence that can associate with a target of interest can be at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more nucleotides in length.

A pre-spatial label can comprise a molecular label, a cellular label, and/or a sample label. A pre-spatial label can comprise a sequence that can associate (e.g., hybridize) with a target. A pre-spatial label can be associated with a solid support. A pre-spatial label can be associated with a substrate.

A pre-spatial label can comprise an activatable consensus sequence. An activatable consensus sequence can comprise a sequence that can be activated to bind to a spatial label block. An activatable consensus sequence can be a cleavable sequence (e.g., restriction endonuclease cleavage site), a sequence that can be tagged, and/or a sequence that can be ligated. An activatable consensus sequence can comprise a chemical moiety that can be activated. For example, the chemical moiety can comprise a fluorophore that can be excited, a photo-cleavable moiety, a moiety that responds to magnets, and/or a binding moiety (e.g., biotin/streptavidin).

An activatable consensus sequence can comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more nucleotides. An activatable consensus sequence can comprise at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more nucleotides.

A pre-spatial label can comprise a sequence that can associate with a target of interest, a molecular label, a sample label and/or an activatable consensus sequence. In some embodiments, a pre-spatial label comprises a sequence that can associate with a target of interest, a molecular label, a sample label and/or an activatable consensus sequence.

A pre-spatial label can be at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 or more nucleotides in length. A pre-spatial label can be at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 or more nucleotides in length.

A spatial label block can comprise a sequence of nucleotides. The sequence can comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more nucleotides. The sequence can comprise at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more nucleotides. A spatial label block can be linked to another (e.g., previous) spatial label block. Spatial label blocks can be linked together, for example, by chemistry (e.g., click chemistry), ligation, split-pool synthesis, combinatorial chemistry, and/or photoactivatable chemistry.

For example, a first group of spatial label blocks can comprise two sticky end sequences, wherein the sense strand has a first sticky end sequence, and the antisense strand has a second sticky end sequence on the 5′ ends of each strand. A second group of spatial label blocks can comprise two sticky end sequences, wherein the sense strand has the second complementary sticky end sequence and the antisense strand has the first complementary sticky end sequence. The first group can be ligated to the second group. Only one ligation can occur. Subsequently, the first group can be contacted to the growing spatial label. The first group can ligate to the sticky end of the second group. Only one ligation can occur. In this way, a spatial label can be lithographically produced on a sample.
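The alternating addition of block groups can be modeled, as a non-limiting sketch, by tracking the free end of the growing label and allowing a block to join only when its first linking sequence complements that end. The `Block` type, the prime notation for complementary ends, and the nucleotide codes below are hypothetical simplifications and do not represent actual sticky end chemistry.

```python
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    first_link: str   # end that joins the growing label, e.g. "A'"
    code: str         # nucleotides indicative of a spatial location
    second_link: str  # end left free for the next block, e.g. "B"


def build_spatial_label(free_end: str, blocks: List[Block]) -> Tuple[str, str]:
    """Concatenate blocks onto a pre-spatial label whose free end is free_end.

    A block is accepted only if its first link is the primed (complementary)
    form of the current free end, so two blocks of the same group cannot be
    added in succession.
    """
    codes = []
    for block in blocks:
        if block.first_link != free_end + "'":
            raise ValueError(
                f"block with first link {block.first_link} cannot join free end {free_end}"
            )
        codes.append(block.code)
        free_end = block.second_link
    return "".join(codes), free_end


group_one = Block("A'", "ACG", "B")     # Group I: joins an A end, exposes B
group_two = Block("B'", "TTC", "A")     # Group II: joins a B end, exposes A
another_one = Block("A'", "GGA", "B")   # a second Group I block

label, end = build_spatial_label("A", [group_one, group_two, another_one])
```

Because each accepted block changes the free end, only one addition can occur per round, which mirrors the single ligation per contacting step described above.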

Linking of spatial label blocks can be performed before, during, and/or after contacting the pre-spatial label with a target of interest. Pre-spatial labels can be associated with targets before, during and/or after linking with spatial label blocks using chemical means, such as cross-linking or hybridization, to aid the association between the pre-spatial label and the target. This can reduce dissociation and/or diffusion of the pre-spatial labels away from the targets.

Linking of spatial label blocks can be performed in a geometric manner such that the resulting length of the spatial label corresponds to the geometric manner in which the spatial label blocks were linked. FIG. 9 shows an exemplary embodiment of a geometric manner of linking spatial label blocks. For example, as one moves left to right on a sample, spatial label blocks can be increasingly added to the pre-spatial label, thereby generating spatial labels with different lengths. The length of the spatial label can correspond to a physical location in the sample. A sample 905 can be divided into sections 910. The sections can be contacted with a pre-spatial label. The pre-spatial label can be contacted with an integer number of spatial label blocks. For example, the sections in row a are contacted with one spatial label block. The sections in row b are contacted with two spatial label blocks. The sections in row c are contacted with three spatial label blocks. The sections in row d are contacted with four spatial label blocks. Moving right to left, sections in column A can be contacted with one spatial label block, and sections in column B can be contacted with two spatial label blocks. Sections in column C can be contacted with three spatial label blocks. The number of spatial label blocks in each section can be a representation of its location in a first dimension (y, vertical) and a second dimension (x, horizontal) within the sample.

The sections can be contacted with the spatial label blocks in any order. The sections can be contacted by rows only. The sections can be contacted by columns only. The sections can be contacted first by rows and then by columns. The sections can be contacted first by columns and then by rows.

The sections can be stochastically labeled, amplified, and/or digitally counted. The length of the spatial label can provide information about the x and y location of the section in the sample. In the embodiment shown in FIG. 7, the shortest spatial label corresponds to the top left-most corner and the longest spatial label corresponds to the bottom right-most corner.
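The relationship between section position and spatial label length can be illustrated with a minimal sketch. It assumes, purely for illustration, that a section receives one block per row step (row a = 1, row b = 2, and so on) plus one block per column step (column A = 1, column B = 2, and so on), and that the total label length is the sum of the two. The function name blocks_per_section and the 4 x 3 grid are hypothetical and are not taken from the disclosure.

```python
from string import ascii_lowercase, ascii_uppercase


def blocks_per_section(n_rows, n_cols):
    """Map (row, column) names to (row_blocks, col_blocks, total_blocks).

    Assumption for illustration: row a gets 1 block, row b gets 2, ...;
    column A gets 1 block, column B gets 2, ...; totals simply add.
    """
    layout = {}
    for r in range(n_rows):
        for c in range(n_cols):
            row_blocks = r + 1
            col_blocks = c + 1
            key = (ascii_lowercase[r], ascii_uppercase[c])
            layout[key] = (row_blocks, col_blocks, row_blocks + col_blocks)
    return layout


layout = blocks_per_section(n_rows=4, n_cols=3)

# Sections with the fewest and the most blocks in total.
shortest = min(layout, key=lambda k: layout[k][2])
longest = max(layout, key=lambda k: layout[k][2])
print(shortest, layout[shortest])
print(longest, layout[longest])
```

Under this additive assumption, sections on the same anti-diagonal (for example row a, column B and row b, column A) end up with the same total length, so a decoder would likely need the per-dimension block counts, or distinguishable row and column blocks, rather than the total length alone.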

Methods for Determining Spatial Location of Targets

Disclosed herein are methods for identifying distinct cells in two or more samples. In some embodiments, the methods include: stochastically barcoding a plurality of targets in the two or more samples using a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a spatial label and a molecular label; estimating the number of the plurality of targets in the two or more samples using the molecular label; and distinguishing the two or more samples from each other using the spatial label, wherein the plurality of targets associated with stochastic barcodes with different spatial labels are from different samples.

Stochastically barcoding the plurality of targets in the two or more samples can include hybridizing the plurality of stochastic barcodes with the plurality of targets to generate stochastically barcoded targets, and at least one of the plurality of targets can be hybridized to one of the plurality of stochastic barcodes. Stochastically barcoding the plurality of targets in the two or more samples can include generating an indexed library of the stochastically barcoded targets.

Each of the two or more samples can include a plurality of cells, and the plurality of targets can be associated with the plurality of cells. Stochastically barcoding the plurality of targets in the two or more samples can be performed with a solid support comprising a plurality of synthetic particles associated with the plurality of stochastic barcodes.

Identification of Specific Cells in a Population of Cells

Spatial labels can be used to identify and label distinct samples (e.g., cells) in a mixed population of samples (e.g., cells). The samples can be, for example, cells in a mixed population of cells.

FIG. 10 illustrates an exemplary embodiment of the method of identifying distinct cells with a spatial label. A sample, for example, comprising a mixed population of cells 1015/1020 can be contacted to a substrate 1005, wherein the substrate 1005 comprises a distribution of different groups of spatial labels 1010/1011/1012. Targets from an individual cell 1015 can be physically close to a first group of same spatial labels 1010. Targets from a different individual cell 1020 can be physically more distant from the first group of spatial labels 1010, but can be close in physical space to other spatial labels 1012 (e.g., a second group of spatial labels). The cells can be lysed, stochastically labeled, amplified, and/or digitally counted. The spatial label can then be used as a code to distinguish between targets from different individual cells.

The targets can be associated with the closest spatial labels. Spatial labels can be any spatial labels of the disclosure (e.g., pre-spatial labels, spatial labels). The targets can be associated with spatial labels that are at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 or more micrometers from the outer edge of the sample (e.g., cell). The targets can be associated with spatial labels that are at most 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, or 50 or more micrometers from the outer edge of the sample (e.g., cell).
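The association of targets with the closest spatial labels can be sketched as a nearest-neighbor assignment. The following is a minimal illustration only: the coordinates, the use of Euclidean distance between a target and a group center, and the function name assign_to_nearest_group are all assumptions introduced here, and the optional max_distance_um cutoff merely stands in for the distance limits recited above.

```python
import math


def assign_to_nearest_group(target_xy, group_centers, max_distance_um=None):
    """Return the spatial label group closest to a target, or None.

    group_centers maps a group name to a hypothetical (x, y) center in
    micrometers. If max_distance_um is given, targets farther than that
    from every group are left unassigned.
    """
    best_label, best_dist = None, math.inf
    for label, (gx, gy) in group_centers.items():
        dist = math.hypot(target_xy[0] - gx, target_xy[1] - gy)
        if dist < best_dist:
            best_label, best_dist = label, dist
    if max_distance_um is not None and best_dist > max_distance_um:
        return None
    return best_label


# Hypothetical layout loosely following the reference numerals of FIG. 10.
group_centers = {"1010": (0.0, 0.0), "1011": (50.0, 0.0), "1012": (100.0, 0.0)}
cell_1015 = (5.0, 3.0)    # a target released near group 1010
cell_1020 = (95.0, -2.0)  # a target released near group 1012

print(assign_to_nearest_group(cell_1015, group_centers))
print(assign_to_nearest_group(cell_1020, group_centers))
print(assign_to_nearest_group((30.0, 40.0), group_centers, max_distance_um=10))
```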

Identification of Spatial Location of Targets in a Sample

The disclosure provides for methods for determining the subcellular location of targets in a cell. FIG. 11 illustrates how subcellular information can be obtained using spatial labels. A sample (e.g., a cell) 1105 can be contacted to a substrate 1009 comprising one or more groups of spatial labels 1110/1111/1112/1113. The groups of spatial labels 1110/1111/1112/1113 can be distributed over the surface of a substrate. The groups of spatial labels 1110/1111/1112/1113 can be distributed over containers (e.g., microwells) of the substrate. The groups of spatial labels 1110/1111/1112/1113 can be distributed into the sample 1105. The groups of spatial labels 1110/1111/1112/1113 can be arranged such that the sample (e.g., a cell) 1105 contacts multiple distinct groups of spatial labels 1110/1111/1112/1113. The sample 1105 can be crosslinked, physically separated, lysed, stochastically labeled with the distinct groups of spatial labels 1110/1111/1112/1113, amplified, and/or digitally counted. Because the location of the distinct groups of spatial labels 1110/1111/1112/1113 can be known, the location of the targets in the cell can be correlated to the identification of the spatial labels 1110/1111/1112/1113. In this way, spatial labels can be used to identify the spatial location of targets in a sample.
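The combination of counting with the molecular label and locating with the spatial label can be sketched as follows. This is a simplified illustration under stated assumptions: each sequencing read is represented as a hypothetical (spatial label, target name, molecular label) tuple, the number of target molecules is estimated as the number of distinct molecular labels observed, and no correction for sequencing errors or label collisions is attempted. The read values, target names, and group coordinates are invented for the example.

```python
from collections import defaultdict


def count_targets(reads):
    """Count distinct molecular labels per (spatial label, target) pair.

    Reads that share a spatial label, a target, and a molecular label are
    treated as amplification copies of one original molecule.
    """
    seen = defaultdict(set)
    for spatial_label, target, molecular_label in reads:
        seen[(spatial_label, target)].add(molecular_label)
    return {key: len(labels) for key, labels in seen.items()}


# Hypothetical reads: (spatial label group, target name, molecular label).
reads = [
    ("1110", "TARGET_A", "ACGT"),
    ("1110", "TARGET_A", "ACGT"),  # duplicate produced by amplification
    ("1110", "TARGET_A", "TTAG"),
    ("1112", "TARGET_A", "ACGT"),
    ("1112", "TARGET_B", "GGCA"),
]

# Hypothetical known positions (micrometers) of each spatial label group.
group_positions = {"1110": (0, 0), "1112": (10, 0)}

counts = count_targets(reads)
for (spatial_label, target), n in sorted(counts.items()):
    print(target, "at", group_positions[spatial_label], "count:", n)
```

Because the position of each spatial label group is assumed to be known, each count can be read directly as "this many molecules of this target near this position".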

Methods for Stochastic Barcoding and Optical Barcoding

Disclosed herein are methods for determining spatial locations of a plurality of single cells. In some embodiments, the methods include: stochastically barcoding the plurality of single cells using a plurality of synthetic particles, wherein each of the plurality of synthetic particles comprises a plurality of stochastic barcodes, a first group of optical labels, and a second group of optical labels, wherein each of the plurality of stochastic barcodes comprises a cellular label and a molecular label, wherein each optical label in the first group of optical labels comprises a first optical moiety and each optical label in the second group of optical labels comprises a second optical moiety, and wherein each of the plurality of synthetic particles is associated with an optical barcode comprising the first optical moiety and the second optical moiety; detecting the optical barcode of each of the plurality of synthetic particles to determine the location of each of the plurality of synthetic particles; and determining the spatial locations of the plurality of single cells based on the locations of the plurality of synthetic particles.

Synthetic Particles with Stochastic Barcodes and Optical Barcodes

Disclosed herein are synthetic particles (for example, beads and magnetic beads) associated with (e.g., attached with) stochastic barcodes and optical labels. For example, a synthetic particle can have one or more optical label regions in which the optical labels are associated with the synthetic particle. In some embodiments, each synthetic particle can have, or have about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, optical label regions. The size of the optical label region can vary, for example, an optical label region can be, or be about, a few microns to tens of microns in width, length, or diameter. In some embodiments, the width, length, or diameter of the optical label region can be, or be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000 microns, or a number or a range between any two of these values. In some embodiments, the length of the optical label region can be, or be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000 microns, or a number or a range between any two of these values. For a synthetic particle with more than one optical label region, each of the optical label regions can have the same size or different sizes. For example, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, of the optical label regions can have different sizes. For example, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, of the optical label regions can have the same size.

The arrangement of the optical label regions can vary. Non-limiting examples of the arrangement of the optical label regions include a longitudinal format, a vertical format, a grid manner, a circular format, or any combination thereof. The shape of the optical label regions can also vary. For example, the optical label regions can be oval-, rectangle-, triangle-, diamond-shaped, or any combination thereof. The optical label regions can be grouped together or be separated from one another. For example, two optical label regions can be separated from one another by, or by about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000 microns, or a number or a range between any two of these values.

The optical label regions can occupy substantially the entire synthetic particle surface, or part of the synthetic particle surface. In some embodiments, the optical label regions can occupy, or occupy about, 0.00001%, 0.0001%, 0.01%, 0.1%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 99.9%, or a number or a range between any two of these values, of the synthetic particle surface. Optical label regions can include optical labels. In some embodiments, an optical label region can have an optical label (OL) attached to the surface of the synthetic particle. The number of optical labels in each of the optical label regions can vary, for example, be or be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000. In some embodiments, an optical label comprises a probe sequence.

In some embodiments, each synthetic particle can include 9 types of optical labels, OL1-9, attached to the surface of the synthetic particle. In some embodiments, each synthetic particle can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these numbers, types of optical labels. For each synthetic particle, OL1-9 can be the same or different. In some embodiments, each type of optical label is attached to the synthetic particle in one optical label region. In some embodiments, at least one of the optical label regions on the synthetic particle comprises more than one type of optical label. In some embodiments, two, three, four, five, or more types of optical labels are present in one optical label region.

An optical label can comprise an oligonucleotide sequence. The optical label can comprise an oligonucleotide. In some embodiments, the optical label can comprise two or more oligonucleotides with the same sequence. The optical label can be, or be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, nucleotides in length. The oligonucleotides of optical labels can be, or be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, nucleotides in length.

In some embodiments, OL1-3 can be used to encode cellular label part 1 corresponding to a first 96 unique cellular labels in the first split step; OL4-6 can be used to encode cellular label part 2 corresponding to a second 96 unique cellular labels in the second split step; and OL7-9 can be used to encode cellular label part 3 corresponding to a third 96 unique cellular labels in the third split step. In some embodiments, OLs encode 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or a number between any two of these values, cellular label parts. In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number between any two of these values, optical labels can be used to encode a part of a cellular label. In some embodiments, each part of a cellular label can represent 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000, or a number or a range between any two of these values, unique cellular labels. An optical barcode of a synthetic particle can include the optical labels on the synthetic particle. The optical barcode of a synthetic particle can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number between any two of these values, optical labels.

In some embodiments, an optical label can comprise an optical moiety, for example a fluorophore or a chromophore. In some embodiments, each nucleotide of an optical label can be associated with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. The optical moiety can be selected from a group of spectrally-distinct optical moieties. Spectrally-distinct optical moieties include optical moieties with distinguishable emission spectra even if their emission spectra may overlap.

Non-limiting examples of optical moieties include Xanthene derivatives: fluorescein, rhodamine, Oregon green, eosin, and Texas red; Cyanine derivatives: cyanine, indocarbocyanine, oxacarbocyanine, thiacarbocyanine, and merocyanine; Squaraine derivatives and ring-substituted squaraines, including Seta, SeTau, and Square dyes; Naphthalene derivatives (dansyl and prodan derivatives); Coumarin derivatives; oxadiazole derivatives: pyridyloxazole, nitrobenzoxadiazole and benzoxadiazole; Anthracene derivatives: anthraquinones, including DRAQ5, DRAQ7 and CyTRAK Orange; Pyrene derivatives: cascade blue; Oxazine derivatives: Nile red, Nile blue, cresyl violet, oxazine 170; Acridine derivatives: proflavin, acridine orange, acridine yellow; Arylmethine derivatives: auramine, crystal violet, malachite green; and Tetrapyrrole derivatives: porphin, phthalocyanine, bilirubin. Other non-limiting examples of optical moieties include Hydroxycoumarin, Aminocoumarin, Methoxycoumarin, Cascade Blue, Pacific Blue, Pacific Orange, Lucifer yellow, NBD, R-Phycoerythrin (PE), PE-Cy5 conjugates, PE-Cy7 conjugates, Red 613, PerCP, TruRed, FluorX, Fluorescein, BODIPY-FL, Cy2, Cy3, Cy3B, Cy3.5, Cy5, Cy5.5, Cy7, TRITC, X-Rhodamine, Lissamine Rhodamine B, Texas Red, Allophycocyanin (APC), APC-Cy7 conjugates, Hoechst 33342, DAPI, Hoechst 33258, SYTOX Blue, Chromomycin A3, Mithramycin, YOYO-1, Ethidium Bromide, Acridine Orange, SYTOX Green, TOTO-1, TO-PRO-1, TO-PRO: Cyanine Monomer, Thiazole Orange, CyTRAK Orange, Propidium Iodide (PI), LDS 751, 7-AAD, SYTOX Orange, TOTO-3, TO-PRO-3, DRAQ5, DRAQ7, Indo-1, Fluo-3, Fluo-4, DCFH, DHR, and SNARF.

The excitation wavelength of the optical moieties can vary, for example be, or be about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000 nanometers, or a number or a range between any two of these values. The emission wavelength of the optical moieties can also vary, for example be, or be about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000 nanometers, or a number or a range between any two of these values.

The molecular weights of the optical moieties can vary, for example be, or be about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000 Daltons (Da), or a number or a range between any two of these values. The molecular weights of the optical moieties can also vary, for example be, or be about, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000 kilo Daltons (kDa), or a number or a range between any two of these values.

The group of spectrally distinct optical moieties can, for example, include five different fluorophores, five different chromophores, a combination of five fluorophores and chromophores, a combination of four different fluorophores and a non-fluorophore, a combination of four chromophores and a non-chromophore, or a combination of four fluorophores and chromophores and a non-fluorophore non-chromophore. In some embodiments, the optical moieties can be one of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, of spectrally-distinct moieties.
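The capacity of such a scheme can be checked with a short sketch. With five distinguishable states per optical label (for example, four fluorophores and a non-fluorophore) and three optical labels per cellular label part (such as OL1-3), there are 5 x 5 x 5 = 125 possible combinations, which is enough to distinguish 96 wells. The base-5 mapping below is only one possible assignment and is an assumption made for illustration; the moiety names F1 to F4 and the functions encode_part and decode_part are hypothetical.

```python
# Five distinguishable states: no moiety plus four hypothetical fluorophores.
MOIETIES = ["none", "F1", "F2", "F3", "F4"]


def encode_part(index, n_labels=3, moieties=MOIETIES):
    """Map a well index to one moiety per optical label (base-N digits)."""
    base = len(moieties)
    if not 0 <= index < base ** n_labels:
        raise ValueError("index does not fit in the available combinations")
    digits = []
    for _ in range(n_labels):
        index, digit = divmod(index, base)
        digits.append(moieties[digit])
    # Most significant digit first, i.e. (OL1, OL2, OL3).
    return tuple(reversed(digits))


def decode_part(code, moieties=MOIETIES):
    """Recover the well index from the observed moieties."""
    index = 0
    for moiety in code:
        index = index * len(moieties) + moieties.index(moiety)
    return index


print(len(MOIETIES) ** 3)               # number of available combinations
print(encode_part(95))                  # code for the last of 96 wells
print(decode_part(encode_part(95)))     # round trip back to the well index
```

In this particular mapping, well index 0 would carry no moiety on any of the three labels; an actual design might skip that combination, which the 125 available codes leave room for.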

In some embodiments, each of a plurality of synthetic particles has a unique optical barcode. For example, the plurality of synthetic particles can include, include about, or include more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400, 500, 600, 700, 800, 900, 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, 10⁹, 10¹⁰, 10¹¹, 10¹², or a number or a range between any two of these values, synthetic particles each with a unique optical barcode. Some of a plurality of synthetic particles can have the same optical barcode. The plurality of synthetic particles can include, include about, or include more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 100, 200, 300, 400, 500, 600, 700, 800, 900, 10³, 10⁴, 10⁵, 10⁶, 10⁷, 10⁸, 10⁹, 10¹⁰, 10¹¹, 10¹², or a number or a range between any two of these values, synthetic particles, some of which have the same optical barcode.

In addition to the “optical labels,” substantially the entire synthetic particle surface or some part of the synthetic particle surface can have stochastic barcodes attached. For example, the stochastic barcodes can occupy 0.00001%, 0.0001%, 0.01%, 0.1%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 99.9%, or a number or a range between any two of these values, of the synthetic particle surface. The stochastic barcodes can be, or be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, nucleotides in length.

Methods for Loading Spatial Labels on a Substrate

Spatial labels can be pre-located on a substrate. A surface of a substrate can be pre-imprinted with stochastic barcodes. In other words, the coordinates of each stochastic barcode on the surface of the substrate can be known. Stochastic barcodes can be pre-imprinted in any geometric manner. In some embodiments, a solid support comprising stochastic barcodes can be pre-located on a substrate. In some embodiments, the coordinates of the stochastic barcodes on a substrate can be unknown. The location of the stochastic barcodes can be user-generated. When the location of the stochastic barcodes on a substrate is unknown, the location of the stochastic barcodes can be decoded.

Methods for Encoding Solid Supports

Disclosed herein are methods for creating encoded solid supports, such as encoded synthetic particles, for determining spatial locations of a plurality of single cells. Each synthetic particle can contain 9 “anchor regions.” In some embodiments, each synthetic particle can contain, or contain about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, anchor regions. Each anchor region can have a size of about a few microns to tens of microns wide. In some embodiments, each anchor region can have a size of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, microns.

The arrangement of the anchor regions can vary. Non-limiting examples of the arrangement of the anchor regions include a longitudinal format, a vertical format, a grid manner, a circular format, or any combination thereof. The shape of the anchor regions can also vary. For example, the anchor regions can be oval-, rectangle-, triangle-, diamond-shaped, or any combination thereof. The anchor regions can be grouped together or be separated from one another. For example, two anchor regions can be separated from one another by, or by about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000 microns, or a number or a range between any two of these values.

The anchor regions can occupy substantially the entire synthetic particle surface, or part of the synthetic particle surface. In some embodiments, the anchor regions can occupy, or occupy about, 0.00001%, 0.0001%, 0.01%, 0.1%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 99.9%, or a number or a range between any two of these values, of the synthetic particle surface. Anchor regions can include optical labels. In some embodiments, an anchor region can have an optical label (OL) attached to the surface of the synthetic particle. The number of optical labels in each of the anchor regions can vary, for example, be or be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 600, 700, 800, 900, 1000. In some embodiments, an optical label comprises a probe sequence. Anchor regions with optical labels attached can be referred to as optical label regions.

Each anchor region can have a unique optical label (OL) attached to the surface of the synthetic particle. In some embodiments, each synthetic particle can include 9 types of optical labels, OL1-9, attached to the surface of the synthetic particle. In some embodiments, each synthetic particle can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these numbers, types of optical labels. For each synthetic particle, OL1-9 can be the same or different. In some embodiments, each type of optical label is attached to the synthetic particle in one anchor region. In some embodiments, at least one of the anchor regions on the synthetic particle comprises more than one type of optical label. In some embodiments, two, three, four, five, or more types of optical labels are present in one anchor region.

An optical label can comprise an oligonucleotide sequence. The optical label can comprise an oligonucleotide. In some embodiments, the optical label can comprise two or more oligonucleotides with the same sequence. The optical label can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, nucleotides in length. The oligonucleotides of optical labels can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, nucleotides in length.

In some embodiments, OL1-3 can be used to encode cellular label part 1 corresponding to a first 96 unique cellular labels in the first split step; OL4-6 can be used to encode cellular label part 2 corresponding to a second 96 unique cellular labels in the second split step; and OL7-9 can be used to encode cellular label part 3 corresponding to a third 96 unique cellular labels in the third split step. In some embodiments, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, or a number between any two of these values, optical labels can be used to encode a part of a cellular label. In some embodiments, each part of a cellular label can correspond to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000, or a number or a range between any two of these values, unique cellular labels. An optical barcode of a synthetic particle can include the optical labels on the synthetic particle. The optical barcode of a synthetic particle can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number between any two of these values, optical labels.

In addition to the “optical label” constituting the optical barcode, the entire synthetic particle surface or part of the synthetic particle surface can have the universal sequence (US) attached with the 3′ end up, i.e., the 3′ end of the oligonucleotide is not attached to the synthetic particle. In some embodiments, the OL oligonucleotides can be 5′ end up rather than 3′ end up. If the OL oligonucleotides are 5′ end up, the optical moieties can be added by a ligation method. The universal sequence can occupy 0.00001%, 0.0001%, 0.01%, 0.1%, 1%, 2%, 3%, 4%, 5%, 6%, 7%, 8%, 9%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 99.9%, or a number or a range between any two of these values, of the synthetic particle surface. The universal sequence can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, nucleotides in length.

For cellular labels each comprising three parts, encoding the synthetic particles can include three encoding steps. The cellular label can include part 1 of the cellular label, part 2 of the cellular label, and part 3 of the cellular label. In some embodiments, the cellular label can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, parts. Encoding the synthetic particles can include 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, encoding steps.
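The effect of successive split and pool steps on the number of possible cellular labels can be illustrated with a small simulation. The sketch below assumes three encoding steps with 96 wells each, so that up to 96 x 96 x 96 = 884,736 distinct three-part cellular labels are possible. Particles are assigned to wells uniformly at random, which is a simplification; the function name split_pool and the representation of each label part as a well index are hypothetical.

```python
import random

WELLS_PER_PLATE = 96
N_STEPS = 3


def split_pool(n_particles, n_steps=N_STEPS, wells=WELLS_PER_PLATE, seed=0):
    """Simulate split-pool encoding; return one tuple of well indices per particle."""
    rng = random.Random(seed)
    particles = [[] for _ in range(n_particles)]
    for _ in range(n_steps):
        # Split: each particle lands in one well and gains that well's label part.
        for parts in particles:
            parts.append(rng.randrange(wells))
        # Pool: particles are mixed again before the next split (implicit here).
    return [tuple(parts) for parts in particles]


# Number of distinct three-part cellular labels that can be formed.
capacity = WELLS_PER_PLATE ** N_STEPS
labels = split_pool(1000)

print(capacity)
print(labels[0])
print(len(set(labels)), "distinct labels among", len(labels), "particles")
```

Because the number of possible labels is much larger than the number of particles in this example, most simulated particles would be expected to carry a distinct cellular label, although occasional repeats remain possible.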

At the first encoding step/the first split step, synthetic particles can be distributed across 96 wells of a first plate and hybridize to oligonucleotides in each well. In some embodiments, the first plate can include 96, 394, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number between any two of these values, wells. Each well can contain 4 types of oligonucleotides, possibly including the universal sequence (US) and three additional types of oligonucleotides. Each additional type of oligonucleotides can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. The three additional types of oligonucleotides encode a cellular label part. In some embodiments, each well can contain, or contain about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, additional types of oligonucleotides. Each type of oligonucleotides can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, or a number or a range between any two of these values, nucleotides in length. Each cellular label part can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, nucleotides in length.

The first type of oligonucleotides each can contain a region complementary to the universal sequence (US), followed by part 1 of the cellular label (1 of 96), followed by a linker sequence (linker 1). The region complementary to the universal sequence can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, nucleotides in length. Linker 1 can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, or a number or a range between any two of these values, nucleotides in length.

The second type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL1 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical label can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, nucleotides in length. The optical moiety can be selected from a group of spectrally-distinct optical moieties. Spectrally-distinct optical moieties include optical moieties with distinguishable emission spectra even if their emission spectra may overlap.

The third type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL2 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical moiety can be selected from a group of spectrally-distinct optical moieties.

The fourth type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL3 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical moiety can be selected from a group of spectrally-distinct optical moieties.

After the synthetic particles are distributed across 96 wells of the first plate and hybridize to oligonucleotides in each well, a polymerase such as a DNA polymerase and a ligase such as a DNA ligase can be introduced into each well. DNA polymerase can extend the universal sequence with cellular label part 1 and linker 1 sequences. DNA ligase can covalently attach the optical moieties onto the OL oligonucleotides.

At the second encoding step including pool and second split, synthetic particles from all the wells of the first plate can be pooled, and split into each of the 96 wells of a second plate. In some embodiments, the second plate can include 96, 394, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number between any two of these values, wells. Each well can contain 4 types of oligonucleotides, possibly including the universal sequence (US) and three additional types of oligonucleotides. Each additional type of oligonucleotides can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. The three additional types of oligonucleotides encode a cellular label part. In some embodiments, each well can contain, or contain about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, additional types of oligonucleotides. Each type of oligonucleotides can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, or a number or a range between any two of these values, nucleotides in length. Each cellular label part can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, nucleotides in length.

The first type of oligonucleotide each can include linker 1, followed by part 2 of the cellular label (1 of 96, for example), followed by another linker sequence (linker 2). Linker 2 can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, or a number or a range between any two of these values, nucleotides in length.

The second type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL4 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical label can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, nucleotides in length. The optical moiety can be selected from a group of spectrally-distinct optical moieties. Spectrally-distinct optical moieties include optical moieties with distinguishable emission spectra even if the emission spectra may overlap.

The third type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL5 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical moiety can be selected from a group of spectrally-distinct optical moieties.

The fourth type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL6 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical moiety can be selected from a group of spectrally-distinct optical moieties.

After the synthetic particles are distributed across 96 wells of the second plate and hybridize to oligonucleotides in each well, a polymerase such as a DNA polymerase and a ligase such as a DNA ligase can be introduced into each well. DNA polymerase can extend the universal sequence with cellular label part 2 and linker 2 sequences. DNA ligase can covalently attach the optical moieties onto the OL oligonucleotides.

At the third encoding step including pool and third split, synthetic particles from all the wells of the second plate can be pooled, and split into each of the 96 wells of a third plate. In some embodiments, the third plate can include 96, 394, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number between any two of these values, wells. Each well can contain 4 types of oligonucleotides, possibly including the universal sequence (US) and three additional types of oligonucleotides. Each additional type of oligonucleotides can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. The three additional types of oligonucleotides encode a cellular label part. In some embodiments, each well can contain, or contain about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, additional types of oligonucleotides. Each type of oligonucleotides can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, or a number or a range between any two of these values, nucleotides in length. Each cellular label part can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, nucleotides in length.

The first type of oligonucleotide each can include linker 2, followed by part 3 of the cellular label (1 of 96), followed by a molecular index (randomers) and oligo(dA). The molecular index can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, or a number or a range between any two of these values, nucleotides in length. The oligo(dA) can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, or a number or a range between any two of these values, nucleotides in length.

The second type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL7 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical label can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, nucleotides in length. The optical moiety can be selected from a group of spectrally-distinct optical moieties. Spectrally-distinct optical moieties include optical moieties with distinguishable emission spectra even if the emission spectra may overlap.

The third type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL8 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical moiety can be selected from a group of spectrally-distinct optical moieties.

The fourth type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL9 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical moiety can be selected from a group of spectrally-distinct optical moieties.

After the synthetic particles are distributed across 96 wells of the third plate and hybridize to oligonucleotides in each well, a polymerase such as a DNA polymerase and a ligase such as a DNA ligase can be introduced into each well. DNA polymerase can extend the universal sequence with cellular label part 3 sequence. DNA ligase can covalently attach the optical moiety onto the OL oligonucleotides.

At the i^(th) encoding step including pool and i^(th) split, synthetic particles from all the wells of the (i−1)^(th) plate can be pooled, and split into each of the 96 wells of an i^(th) plate. In some embodiments, the i^(th) plate can include 96, 394, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number between any two of these values, wells. Each well can contain m types of oligonucleotides, possibly including the universal sequence (US) and j additional types of oligonucleotides. Each additional type of oligonucleotides can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. The optical moiety can be part of an oligonucleotide. The j additional types of oligonucleotides encode a cellular label part. In some embodiments, each well can contain, or contain about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, additional types of oligonucleotides. Each type of oligonucleotides can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, or a number or a range between any two of these values, nucleotides in length. Each cellular label part can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, or a number or a range between any two of these values, nucleotides in length.

The first type of oligonucleotide each can include linker (i−1), followed by part i of the cellular label (1 of 96, for example), followed by another linker sequence (linker i). Linker i and linker (i−1) can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 10000, or a number or a range between any two of these values, nucleotides in length.

Each of the j additional types of oligonucleotides can include a duplex structure that contains a strand complementary to OLm with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore or a chromophore, on the 3′ end. In some embodiments, the optical label is on the 5′ end, or on neither the 5′ end nor the 3′ end. The optical label can be, or can be about, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, or a number or a range between any two of these values, nucleotides in length. The optical moiety can be selected from a group of k spectrally-distinct optical moieties. Spectrally-distinct optical moieties include optical moieties with distinguishable emission spectra even if the emission spectra may overlap.

After the synthetic particles are distributed across 96 wells of the i^(th) plate and hybridize to oligonucleotides in each well, a polymerase such as a DNA polymerase and a ligase such as a DNA ligase can be introduced into each well. DNA polymerase can extend the universal sequence with cellular label part i and linker i sequences. DNA ligase can covalently attach the optical moieties onto the OL oligonucleotides. In some embodiments, optical moieties can be added to each of the OL oligonucleotides by polymerase extension with optical moiety-labeled nucleotides. For an optical barcode comprising n optical labels, possibly encoded by i encoding steps each with j additional types of oligonucleotides, each selected from a group of k spectrally-distinct optical moieties, the optical barcode can represent k^(n)=k^(i*j) unique cellular labels.
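The counting relationship above can be illustrated with a minimal Python sketch. The function names and the example value k=4 are hypothetical and chosen only for illustration. The second function additionally assumes that every well of an encoding step should receive a distinct combination of optical moieties, which the disclosure does not state as a requirement.

```python
def optical_barcode_capacity(k: int, i: int, j: int) -> int:
    """Return k**(i*j): the number of optical barcodes with n = i*j labels,
    each label drawn from k spectrally-distinct optical moieties."""
    if min(k, i, j) < 1:
        raise ValueError("k, i and j must all be at least 1")
    n = i * j  # total number of optical labels on the particle
    return k ** n


def min_moieties_per_step(wells: int, j: int) -> int:
    """Smallest k such that k**j >= wells.

    Assumption (not from the disclosure): each well in one encoding step
    is given its own unique combination of j optical moieties.
    """
    k = 1
    while k ** j < wells:
        k += 1
    return k


# Three encoding steps with three additional oligonucleotide types each
# give n = 9 optical labels; k = 4 is an illustrative value only.
capacity = optical_barcode_capacity(k=4, i=3, j=3)

# With 96 wells per plate and j = 3 labels per step, at least this many
# spectrally-distinct moieties would be needed under the stated assumption.
k_needed = min_moieties_per_step(wells=96, j=3)
```

Under these assumptions, `capacity` is 4^9 = 262144 and `k_needed` is 5, because 4^3 = 64 is smaller than 96 while 5^3 = 125 is not.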

In some embodiments, the first encoding reaction with coupling of OS1-3 and 1 of the 96 cell label part 1 can be achieved by an enzymatic process. In some embodiments, the first encoding reaction can be incorporated in the lithography process. Instead of generating 1 type of core synthetic particle with lithography, 96 types of synthetic particles can be generated by lithography. The universal sequence can be replaced by one of the 96 ‘cell label part 1’ oligonucleotides. The corresponding combination of OS1, OS2, and OS3 optical moieties can be attached to the synthetic particles during the lithography process.

Synthetic Particle Synthesis

In some embodiments, the synthetic particles are generated using photolithography. In some embodiments, the synthetic particles are generated using stop flow lithography. To generate synthetic particles with n OL oligonucleotides, for example 9 OL oligonucleotides, fabricate a microfluidic device (e.g. PDMS or NOA) with n input ports converging to a single channel, leading to 1 output port. Into each of the input ports, feed a mixture of, for example, poly(ethylene glycol) diacrylate (PEGDA), photoinitiator, 5′ acrydite modified universal sequence (US) oligonucleotide, and 5′ acrydite modified OL oligonucleotide (OL1 oligonucleotide for input port 1, OL2 oligonucleotide for input port 2, . . . , OLn oligonucleotide for input port n). Subsequently, apply pressure at each of the input ports. The n inputs will form n parallel streams under a laminar flow regime. Expose a region of the converged channel to, for example, UV light through a photomask with the outline of the shape of the synthetic particle. Upon UV exposure, PEGDA and acrydite oligonucleotides can crosslink to form a solid hydrogel synthetic particle, with n regions each with a different OL oligonucleotide arranged side by side. The synthetic particles can be collected at the output port and used for encoding solid supports, such as synthetic particles.

Methods for Decoding Substrates

In some embodiments, the methods can include decoding the solid support. In some embodiments, the method can include decoding the plurality of synthetic particles. Decoding the plurality of synthetic particles can include detecting the optical barcode of the plurality of synthetic particles. The methods can include determining the locations of the plurality of synthetic particles. Detecting the optical barcode of each of the plurality of synthetic particles to determine the location of each of the plurality of synthetic particles can include generating an optical image showing the optical barcodes and the locations of the plurality of synthetic particles.

The disclosure provides for methods for decoding substrates (e.g., arrays) comprising stochastic barcodes. In some embodiments, the methods comprise decoding the solid support. In some embodiments, decoding does not rely solely on the use of optical signatures, for example optical barcodes (although as described herein, the use of beads with optical signatures can allow the “reuse” of the decoding probes), but rather on the use of combinatorial decoding nucleic acids that are added during a decoding step. Decoding can be performed with sequential hybridizations. The decoding nucleic acids can hybridize either to a distinct identifier coding nucleic acid (identifier probe) that is placed on the beads, or to the bioactive agent itself, for example when the bioactive agent is a nucleic acid, at least some portion of which is single stranded to allow hybridization to a decoding probe. The decoding nucleic acids can be either directly or indirectly labeled. Decoding occurs by detecting the presence of the label.

The coding nucleic acids (also termed identifier probes (IP) or identifier nucleic acids) can comprise a primer sequence and an adjacent decoding sequence. Each decoder (or decoding) probe can comprise a priming sequence (sometimes referred to herein as an “invariant sequence”), that can hybridize to the primer sequence, and at least one decoding nucleotide, generally contained within a variable sequence. The decoder probes can be made as sets, with each set comprising at least four subsets that each have a different decoding nucleotide at the same position, i.e. the detection position (i.e. adenine, thymidine (or uracil, as desired), cytosine and guanine), with each nucleotide at the detection position (detection nucleotide) comprising a unique label, preferably a fluorophore. The decoder probes can be added under conditions that allow discrimination of perfect complementarity and imperfect complementarity. Thus, the decoding probe that comprises the correct base for basepairing with the coding nucleotide being interrogated can hybridize the best. The other decoding probes can be washed away. The detection of the unique fluorophore associated with the detection nucleotide can allow for the identification of the coding nucleotide at that position. By repeating these steps with a new set of decoding probes that extends the position of the detection nucleotide by one base, the identity of the next coding nucleotide can be elucidated. Decoding can use a large number of probes. Split and mix combinatorial synthesis can be used to prepare the decoding probes.

Parity analysis can be used during decoding to increase the robustness and accuracy of the system. Parity analysis can refer to a decoding step wherein the signal of a particular element can be analyzed across a plurality of decoding stages. That is, following at least one decoding step, the signal of an array element across the decoding stages can be analyzed. The signal from a particular bead can be evaluated across multiple stages. Although the analysis can include any parameter that can be obtained from the signals, such as evaluating the total signal obtained across the stages, the parity of the signals across the stages can be analyzed.

Parity can refer to the digital or modular readout of signals, i.e. odd or even, when binary signals are used. The digit sum of the signals across a plurality of stages can be translated into a parity determination. The parity determination can be useful in evaluating the decoding process. For example, codes can be designed to have an odd number of a particular signal, for example a red signal, when viewed across all stages or decoding steps, or a pre-determined plurality of stages or decoding steps. The detection of an even number of red stages can provide an indication that an error has occurred at some point in decoding. When this result is obtained, the faulty code can either be discarded, or the analysis repeated.

The disclosure provides for introducing a “redundant stage” into the decoding system. A redundant stage can refer to a stage that serves as a parity check. That is, following the decoding stages, an additional stage can be included to analyze the parity. This analysis can provide an indication of the competence or validity of the decoding. When codes are designed with a pre-determined parity, the redundant stage can be used to detect the parity of the signals obtained from the decoding step. The redundant stage can detect errors in parity because if there has been an error in decoding, the parity detected following the redundant stage will be different from the parity designed into the codes.
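The parity check and redundant stage described above can be sketched in Python as follows. This is an illustrative sketch only: the two-signal alphabet ("red" and "green"), the function names, and the choice of odd parity on the red signal are assumptions made for the example, following the odd-number-of-red-signals example in the text.

```python
def add_redundant_stage(stages, signal="red", other="green"):
    """Append one stage so that the full code has an odd number of `signal`.

    `stages` is the designed sequence of signals for one array element.
    """
    count = sum(s == signal for s in stages)
    # If the count is already odd, the extra stage must not add another one.
    redundant = other if count % 2 == 1 else signal
    return list(stages) + [redundant]


def passes_parity(observed, signal="red"):
    """True when the observed stages contain an odd number of `signal`."""
    return sum(s == signal for s in observed) % 2 == 1


# Design a code with three decoding stages plus one redundant stage.
designed = add_redundant_stage(["red", "green", "red"])

# A read-out in which the first stage was mis-called.
corrupted = ["green"] + designed[1:]

designed_ok = passes_parity(designed)    # parity as designed
corrupted_ok = passes_parity(corrupted)  # single error flips the parity
```

A single mis-called stage changes the parity and is flagged, so the code can be discarded or the analysis repeated. Two errors in the same code would restore the designed parity and would not be detected by this check.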

In some embodiments, decoding can occur through the use of 8-mer oligonucleotides strung together to create a decoding oligonucleotide with a few 8-mers on it. The decoding oligonucleotide can comprise at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more 8-mers on it. The decoding oligonucleotide can comprise at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more 8-mers on it. The decoding oligonucleotide can hybridize to a few different stochastic barcodes (e.g., by hybridizing its different 8-mer regions to different 8-mer regions on stochastic barcodes). The decoding oligonucleotide can be fluorescently labeled, melted off, and sequenced. The decoding oligonucleotides can be fluorescently labeled in different colors. The decoding oligonucleotides can be fluorescently labeled with the same color but with various levels of fluorescent intensity, thereby generating a “gray-scale” map of a probe. Repeating this can provide a solvable map of where each 8-mer of a stochastic barcode is in relation to each other. The method can be repeated at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more times. The method can be repeated at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more times.

In some embodiments, decoding can be performed by sequencing by synthesis (e.g., using 454 and/or Ion Torrent sequencing). Decoding can be performed by imaging of optically encoded beads. For example, beads can be encoded with quantum dots or fluorophores which can be embedded in the beads. The quantum dots or fluorophores can be used in the decoding process. In some embodiments, the optically encoded beads can comprise a dye. The dye can be used to distinguish beads with different stochastic barcodes. Decoding can occur with the use of physically encoded solid supports (e.g., beads). For example, a bead can be patterned or engraved with an identifier. The identifier can be etched into the bead with a laser or with lithography methods. In some embodiments, the beads can be physically encoded based on size and/or shape. Decoding can occur with electronically encoded beads. Decoding can use an electronic readout to read the electronic identifier in the beads. An electronic identifier can include, for example, an RFID tag, an electrical resistance, and/or an electrical capacitance.

Diffusion Across a Substrate

When a sample (e.g., cell) is stochastically barcoded according to the methods of the disclosure, the cell can be lysed. In some embodiments, lysis of a cell can result in the diffusion of the contents of the lysis (e.g., cell contents) away from the initial location of lysis. In other words, the lysis contents can move into a larger surface area than the surface area taken up by the cell.

Diffusion of sample lysis mixture (e.g., comprising targets) can be modulated by various parameters including, but not limited to, viscosity of the lysis mixture, temperature of the lysis mixture, the size of the targets, the size of physical barriers in a substrate, the concentration of the lysis mixture, and the like. For example, the lysis reaction can be performed at a temperature of at least 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, or 40 °C or more. The lysis reaction can be performed at a temperature of at most 1, 2, 3, 4, 5, 10, 15, 20, 25, 30, 35, or 40 °C or more. The viscosity of the lysis mixture can be altered by, for example, adding thickening reagents (e.g., glycerol, beads) to slow the rate of diffusion. The viscosity of the lysis mixture can be altered by, for example, adding thinning reagents (e.g., water) to increase the rate of diffusion. A substrate can comprise physical barriers (e.g., wells, microwells, microhills) that can alter the rate of diffusion of targets from a sample. The concentration of the lysis mixture can be altered to increase or decrease the rate of diffusion of targets from a sample. The concentration of a lysis mixture can be increased or decreased by at least 1, 2, 3, 4, 5, 6, 7, 8, or 9 or more fold. The concentration of a lysis mixture can be increased or decreased by at most 1, 2, 3, 4, 5, 6, 7, 8, or 9 or more fold.

The rate of diffusion can be increased. The rate of diffusion can be decreased. The rate of diffusion of a lysis mixture can be increased or decreased by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more fold compared to an un-altered lysis mixture. The rate of diffusion of a lysis mixture can be increased or decreased by at most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more fold compared to an un-altered lysis mixture. The rate of diffusion of a lysis mixture can be increased or decreased by at least 10, 20, 30, 40, 50, 60, 70, 80, 90 or 100% compared to an un-altered lysis mixture.

Sample Imaging

The disclosure provides for compositions, methods, kits, and systems for identifying the spatial location of nucleic acids in a target from a sample. FIG. 12 illustrates an exemplary embodiment of the homopolymer tailing method of the disclosure. The disclosure provides for a substrate 1210 comprising a plurality of probes 1205 attached to the surface of the substrate. The substrate 1210 can be a microarray. The plurality of probes 1205 can comprise an oligo(dT). The plurality of probes 1205 can comprise a gene-specific sequence. The plurality of probes 1205 can comprise a stochastic barcode. A sample (e.g., cells) 1215 can be placed and/or grown on the substrate 1210. The substrate comprising the sample can be analyzed 1220, for example by imaging and/or immunohistochemistry. The sample 1215 can be lysed 1225 on the substrate 1210. The nucleic acids 1230 from the sample 1215 can associate (e.g., hybridize) with the plurality of probes 1205 on the substrate 1210. In some embodiments, the nucleic acids 1230 can be reverse transcribed, homopolymer tailed, and/or amplified (e.g., with bridge amplification). The amplified nucleic acids can be interrogated 1235 with detection probes 1240 (e.g., fluorescent probes). The detection probes 1240 can be gene-specific probes. The location of binding of the detection probes 1240 on the substrate 1210 can be correlated with the image of the substrate, thereby producing a map that indicates the spatial location of nucleic acids in the sample.

In some embodiments, the methods of the disclosure can comprise making 1245 a replicate 1246 of the original substrate 1210. The replicate substrate 1246 can comprise a plurality of probes 1231. The plurality of probes 1231 can be the same as the plurality of probes 1205 on the original substrate 1210. The plurality of probes 1231 can be different than the plurality of probes 1205 on the original substrate 1210. For example, the plurality of probes 1205 can be oligo(dT) probes and the plurality of probes 1231 on the replicate substrate 1246 can be gene-specific probes. The replicate substrate can be processed like the original substrate, such as with interrogation by detection (e.g., fluorescent) probes.

Data Analysis and Display Software

Data Analysis and Visualization of Spatial Resolution of Targets

The disclosure provides for methods for estimating the number and position of targets with stochastic barcoding and digital counting using spatial labels. The data obtained from the methods of the disclosure can be visualized on a map. A map of the number and location of targets from a sample can be constructed using information generated using the methods described herein. The map can be used to locate a physical location of a target. The map can be used to identify the location of multiple targets. The multiple targets can be the same species of target, or the multiple targets can be multiple different targets. For example, a map of a brain can be constructed to show the digital count and location of multiple targets.
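One way such a map could be assembled from sequencing records is sketched below in Python. The record layout (spatial label, target name, molecular label), the lookup table from spatial label to substrate coordinates, the gene names, and all function and variable names are hypothetical and serve only to illustrate digital counting per location. The sketch counts distinct molecular labels, so repeated reads of the same barcoded molecule are counted once.

```python
from collections import defaultdict


def build_count_map(reads, label_to_xy):
    """Return {((x, y), target): number of distinct molecular labels}.

    reads: iterable of (spatial_label, target, molecular_label) tuples.
    label_to_xy: dict mapping each spatial label to an (x, y) position.
    """
    unique_labels = defaultdict(set)
    for spatial_label, target, molecular_label in reads:
        # A set keeps each molecular label once, however many reads it has.
        unique_labels[(spatial_label, target)].add(molecular_label)

    count_map = {}
    for (spatial_label, target), labels in unique_labels.items():
        xy = label_to_xy[spatial_label]
        count_map[(xy, target)] = len(labels)
    return count_map


# Illustrative records only; sequences and gene names are made up.
reads = [
    ("S1", "GAPDH", "AAAC"),
    ("S1", "GAPDH", "AAAC"),  # duplicate read of the same molecule
    ("S1", "GAPDH", "GGTT"),
    ("S2", "GAPDH", "AAAC"),
    ("S2", "ACTB", "CCCC"),
]
label_to_xy = {"S1": (0, 0), "S2": (0, 1)}

count_map = build_count_map(reads, label_to_xy)
```

The resulting dictionary can be passed to any plotting routine to draw the counts at their substrate positions, one layer per target.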

The map can be generated from data from a single sample. The map can be constructed using data from multiple samples, thereby generating a combined map. The map can be constructed with data from tens, hundreds, and/or thousands of samples. A map constructed from multiple samples can show a distribution of digital counts of targets associated with regions common to the multiple samples. For example, replicated assays can be displayed on the same map. At least 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more replicates can be displayed (e.g., overlaid) on the same map. At most 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 or more replicates can be displayed (e.g., overlaid) on the same map. The spatial distribution and number of targets can be represented by a variety of statistics.

Combining data from multiple samples can increase the locational resolution of the combined map. The orientation of multiple samples can be registered by common landmarks, wherein the individual locational measurements across samples are at least in part non-contiguous. A particular example is sectioning a sample using a microtome on one axis and then sectioning a second sample along a different axis. The combined dataset will give three dimensional spatial locations associated with digital counts of targets. Multiplexing the above approach will allow for high resolution three dimensional maps of digital counting statistics.

In some embodiments of the instrument system, the system will comprise computer-readable media that includes code for providing data analysis for the sequence datasets generated by performing single cell, stochastic barcoding assays. Examples of data analysis functionality that can be provided by the data analysis software include, but are not limited to, (i) algorithms for decoding/demultiplexing of the sample label, cellular label, spatial label, and molecular label, and target sequence data provided by sequencing the stochastic barcode library created in running the assay, (ii) algorithms for determining the number of reads per gene per cell, and the number of unique transcript molecules per gene per cell, based on the data, and creating summary tables, (iii) statistical analysis of the sequence data, e.g. for clustering of cells by gene expression data, or for predicting confidence intervals for determinations of the number of transcript molecules per gene per cell, etc., (iv) algorithms for identifying sub-populations of rare cells, for example, using principal component analysis, hierarchical clustering, k-means clustering, self-organizing maps, neural networks etc., (v) sequence alignment capabilities for alignment of gene sequence data with known reference sequences and detection of mutations, polymorphic markers and splice variants, and (vi) automated clustering of molecular labels to compensate for amplification or sequencing errors. In some embodiments, commercially-available software can be used to perform all or a portion of the data analysis, for example, the Seven Bridges (https://www.sbgenomics.com/) software can be used to compile tables of the number of copies of one or more genes occurring in each cell for the entire collection of cells. In some embodiments, the data analysis software can include options for outputting the sequencing results in useful graphical formats, e.g. heatmaps that indicate the number of copies of one or more genes occurring in each cell of a collection of cells. In some embodiments, the data analysis software can further comprise algorithms for extracting biological meaning from the sequencing results, for example, by correlating the number of copies of one or more genes occurring in each cell of a collection of cells with a type of cell, a type of rare cell, or a cell derived from a subject having a specific disease or condition. In some embodiments, the data analysis software can further comprise algorithms for comparing populations of cells across different biological samples.
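As an illustration of item (vi) above, the following Python sketch collapses molecular labels that differ from a more abundant label by a single base. It is a simple greedy heuristic written for this example, not an algorithm specified by the disclosure; the Hamming-distance threshold of 1, the function names, and the example sequences are all assumptions.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def collapse_molecular_labels(labels, max_distance=1):
    """Merge each label into the first more-abundant label within
    `max_distance` mismatches; return {parent_label: total_reads}."""
    counts = Counter(labels)
    # Visit labels from most to least abundant (ties broken alphabetically)
    # so that likely error variants are absorbed by their likely source.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    parents = {}
    for label in ordered:
        for parent in parents:
            if len(parent) == len(label) and hamming(parent, label) <= max_distance:
                parents[parent] += counts[label]
                break
        else:
            parents[label] = counts[label]
    return parents


# Illustrative reads for one gene in one cell: "ACGA" is treated as a
# probable sequencing error of the abundant label "ACGT".
observed = ["ACGT"] * 5 + ["ACGA"] + ["TTTT"] * 2
collapsed = collapse_molecular_labels(observed)
unique_molecules = len(collapsed)
```

After collapsing, the number of remaining parent labels serves as the estimate of unique transcript molecules for that gene and cell. A stricter or looser threshold trades missed errors against wrongly merged true molecules.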

In some embodiments, all of the data analysis functionality can be packaged within a single software package. In some embodiments, the complete set of data analysis capabilities can comprise a suite of software packages. In some embodiments, the data analysis software can be a standalone package that is made available to users independently of the assay instrument system. In some embodiments, the software can be web-based, and can allow users to share data.

System Processors and Networks

In general, the computer or processor included in the presently disclosed instrument systems, as illustrated in FIG. 13, can be further understood as a logical apparatus that can read instructions from media 1311 or a network port 1305, which can optionally be connected to server 1309 having fixed media 1312. The system 1300, such as shown in FIG. 13, can include a CPU 1301, disk drives 1303, optional input devices such as keyboard 1315 or mouse 1316 and optional monitor 1307. Data communication can be achieved through the indicated communication medium to a server at a local or a remote location. The communication medium can include any means of transmitting or receiving data. For example, the communication medium can be a network connection, a wireless connection or an internet connection. Such a connection can provide for communication over the World Wide Web. It is envisioned that data relating to the present disclosure can be transmitted over such networks or connections for reception or review by a party 1322 as illustrated in FIG. 13.

FIG. 14 illustrates an exemplary embodiment of a first example architecture of a computer system 1400 that can be used in connection with example embodiments of the present disclosure. As depicted in FIG. 14, the example computer system can include a processor 1402 for processing instructions. Non-limiting examples of processors include: Intel Xeon™ processor, AMD Opteron™ processor, Samsung 32-bit RISC ARM 1176JZ(F)-S v1.0™ processor, ARM Cortex-A8 Samsung S5PC100™ processor, ARM Cortex-A8 Apple A4™ processor, Marvell PXA 930™ processor, or a functionally-equivalent processor. Multiple threads of execution can be used for parallel processing. In some embodiments, multiple processors or processors with multiple cores can also be used, whether in a single computer system, in a cluster, or distributed across systems over a network comprising a plurality of computers, cell phones, or personal data assistant devices.

As illustrated in FIG. 14, a high speed cache 1404 can be connected to, or incorporated in, the processor 1402 to provide a high speed memory for instructions or data that have been recently, or are frequently, used by processor 1402. The processor 1402 is connected to a north bridge 1406 by a processor bus 1408. The north bridge 1406 is connected to random access memory (RAM) 1410 by a memory bus 1412 and manages access to the RAM 1410 by the processor 1402. The north bridge 1406 is also connected to a south bridge 1414 by a chipset bus 1416. The south bridge 1414 is, in turn, connected to a peripheral bus 1418. The peripheral bus can be, for example, PCI, PCI-X, PCI Express, or other peripheral bus. The north bridge and south bridge are often referred to as a processor chip set and manage data transfer between the processor, RAM, and peripheral components on the peripheral bus 1418. In some alternative architectures, the functionality of the north bridge can be incorporated into the processor instead of using a separate north bridge chip.

In some embodiments, system 1400 can include an accelerator card 1422 attached to the peripheral bus 1418. The accelerator can include field programmable gate arrays (FPGAs) or other hardware for accelerating certain processing. For example, an accelerator can be used for adaptive data restructuring or to evaluate algebraic expressions used in extended set processing.

Software and data are stored in external storage 1424 and can be loaded into RAM 1410 or cache 1404 for use by the processor. The system 1400 includes an operating system for managing system resources, as well as application software running on top of the operating system for managing data storage and optimization in accordance with example embodiments of the present invention. Non-limiting examples of operating systems include: Linux, Windows™, MACOS™, BlackBerry OS™, iOS™, and other functionally-equivalent operating systems.

In this example, system 1400 also includes network interface cards (NICs) 1420 and 1421 connected to the peripheral bus for providing network interfaces to external storage, such as Network Attached Storage (NAS) and other computer systems that can be used for distributed parallel processing.

FIG. 15 illustrates an exemplary diagram showing a network 1500 with a plurality of computer systems 1502a and 1502b, a plurality of cell phones and personal data assistants 1502c, and Network Attached Storage (NAS) 1504a and 1504b. In example embodiments, systems 1502a, 1502b, and 1502c can manage data storage and optimize data access for data stored in Network Attached Storage (NAS) 1504a and 1504b. A mathematical model can be used for the data and be evaluated using distributed parallel processing across computer systems 1502a and 1502b, and cell phone and personal data assistant systems 1502c. Computer systems 1502a and 1502b, and cell phone and personal data assistant systems 1502c can also provide parallel processing for adaptive data restructuring of the data stored in Network Attached Storage (NAS) 1504a and 1504b. FIG. 15 illustrates an example only, and a wide variety of other computer architectures and systems can be used in conjunction with the various embodiments of the present invention. For example, a blade server can be used to provide parallel processing. Processor blades can be connected through a back plane to provide parallel processing. Storage can also be connected to the back plane or as Network Attached Storage (NAS) through a separate network interface.

In some example embodiments, processors can maintain separate memory spaces and transmit data through network interfaces, back plane or other connectors for parallel processing by other processors. In other embodiments, some or all of the processors can use a shared virtual address memory space.
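As a rough illustration of splitting an analysis across workers and merging their partial results, the following minimal Python sketch counts distinct molecular labels per spatial label over several partitions of reads. It is not part of the disclosure: the function names, the label strings, and the use of threads as a stand-in for separate processors or networked systems are all assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def distinct_pairs(partition):
    """Return the distinct (spatial_label, molecular_label) pairs in one partition."""
    return set(partition)


def parallel_distinct_counts(partitions, max_workers=2):
    """Count distinct molecular labels per spatial label across all partitions.

    Each worker deduplicates its own partition; the partial sets are then
    merged by union so duplicates that span partitions are counted once.
    Threads are used here only as a hypothetical stand-in for processors.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partial_sets = list(pool.map(distinct_pairs, partitions))
    merged = set().union(*partial_sets)
    return Counter(spatial for spatial, _molecular in merged)


# Hypothetical reads, already split into two partitions.
partitions = [
    [("SP1", "ML_a"), ("SP1", "ML_b")],
    [("SP1", "ML_a"), ("SP2", "ML_c")],  # ("SP1", "ML_a") repeats across partitions
]
label_counts = parallel_distinct_counts(partitions)
```

Workers return sets rather than counts because a duplicate read can fall into a different partition from its original; summing per-partition counts would count it twice.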

FIG. 16 illustrates an exemplary block diagram of a multiprocessor computer system 1600 using a shared virtual address memory space in accordance with an example embodiment. The system includes a plurality of processors 1602a-f that can access a shared memory subsystem 1604. The system incorporates a plurality of programmable hardware memory algorithm processors (MAPs) 1606a-f in the memory subsystem 1604. Each MAP 1606a-f can comprise a memory 1608a-f and one or more field programmable gate arrays (FPGAs) 1610a-f. The MAP provides a configurable functional unit and particular algorithms or portions of algorithms can be provided to the FPGAs 1610a-f for processing in close coordination with a respective processor. For example, the MAPs can be used to evaluate algebraic expressions regarding the data model and to perform adaptive data restructuring in example embodiments. In this example, each MAP is globally accessible by all of the processors for these purposes. In one configuration, each MAP can use Direct Memory Access (DMA) to access an associated memory 1608a-f, allowing it to execute tasks independently of, and asynchronously from, the respective microprocessor 1602a-f. In this configuration, a MAP can feed results directly to another MAP for pipelining and parallel execution of algorithms.

The above computer architectures and systems are examples only, and a wide variety of other computer, cell phone, and personal data assistant architectures and systems can be used in connection with example embodiments, including systems using any combination of general processors, co-processors, FPGAs and other programmable logic devices, systems on chips (SOCs), application specific integrated circuits (ASICs), and other processing and logic elements. In some embodiments, all or part of the computer system can be implemented in software or hardware. Any variety of data storage media can be used in connection with example embodiments, including random access memory, hard drives, flash memory, tape drives, disk arrays, Network Attached Storage (NAS) and other local or distributed data storage devices and systems.

In example embodiments, the computer subsystem of the present disclosure can be implemented using software modules executing on any of the above or other computer architectures and systems. In other embodiments, the functions of the system can be implemented partially or completely in firmware, programmable logic devices such as field programmable gate arrays (FPGAs), systems on chips (SOCs), application specific integrated circuits (ASICs), or other processing and logic elements. For example, the Set Processor and Optimizer can be implemented with hardware acceleration through the use of a hardware accelerator card, such as accelerator card 1422.

Kits

Disclosed herein are kits for performing single cell, stochastic barcoding assays. The kit can comprise one or more substrates (e.g., microwell array), either as a free-standing substrate (or chip) comprising one or more microwell arrays, or packaged within one or more flow-cells or cartridges, and one or more solid support suspensions, wherein the individual solid supports within a suspension comprise a plurality of attached stochastic barcodes of the disclosure. In some embodiments, the kit can further comprise a mechanical fixture for mounting a free-standing substrate in order to create reaction wells that facilitate the pipetting of samples and reagents into the substrate. The kit can further comprise reagents, e.g. lysis buffers, rinse buffers, or hybridization buffers, for performing the stochastic barcoding assay. The kit can further comprise reagents (e.g. enzymes, primers, or buffers) for performing nucleic acid extension reactions, for example, reverse transcription reactions. The kit can further comprise reagents (e.g. enzymes, universal primers, sequencing primers, target-specific primers, or buffers) for performing amplification reactions to prepare sequencing libraries. The kit can comprise reagents for performing the label lithography method of the disclosure (e.g., pre-spatial labels and reagents for activating the activatable consensus sequence).

The kit can comprise one or more molds, for example, molds comprising an array of micropillars, for casting substrates (e.g., microwell arrays), and one or more solid supports (e.g., bead), wherein the individual beads within a suspension comprise a plurality of attached stochastic barcodes of the disclosure. The kit can further comprise a material for use in casting substrates (e.g. agarose, a hydrogel, PDMS, and the like).

The kit can comprise one or more substrates that are pre-loaded with solid supports comprising a plurality of attached stochastic barcodes of the disclosure. In some embodiments, there can be one solid support per microwell of the substrate. In some embodiments, the plurality of stochastic barcodes can be attached directly to a surface of the substrate, rather than to a solid support. In any of these embodiments, the one or more microwell arrays can be provided in the form of free-standing substrates (or chips), or they can be packed in flow-cells or cartridges.

In some embodiments of the disclosed kits, the kit can comprise one or more cartridges that incorporate one or more substrates. In some embodiments, the one or more cartridges can further comprise one or more pre-loaded solid supports, wherein the individual solid supports within a suspension comprise a plurality of attached stochastic barcodes of the disclosure. In some embodiments, the beads can be pre-distributed into the one or more microwell arrays of the cartridge. In some embodiments, the beads, in the form of suspensions, can be pre-loaded and stored within reagent wells of the cartridge. In some embodiments, the one or more cartridges can further comprise other assay reagents that are pre-loaded and stored within reagent reservoirs of the cartridges.

Disclosed herein are kits for performing spatial analysis of nucleic acids in a sample. The kit can comprise one or more substrates (e.g., array) of the disclosure, for example as a free-standing substrate (or chip) comprising one or more arrays. The array can comprise probes of the disclosure. The kit can comprise one or more replicate arrays of the disclosure. The replicate arrays can comprise either gene-specific or oligo(dT)/poly(A) probes.

The kit can further comprise reagents, e.g. lysis buffers, rinse buffers, or hybridization buffers, for performing the assay. The kit can further comprise reagents (e.g. enzymes, primers, dNTPs, NTPs, RNase inhibitors, or buffers) for performing nucleic acid extension reactions, for example, reverse transcription reactions and primer extension reactions. The kit can further comprise reagents (e.g. enzymes, universal primers, sequencing primers, target-specific primers, or buffers) for performing amplification reactions to prepare sequencing libraries. The kit can comprise reagents for homopolymer tailing of molecules (e.g., a terminal transferase enzyme, and dNTPs). The kit can comprise reagents for any enzymatic cleavage of the disclosure (e.g., ExoI nuclease, restriction enzyme).

Kits can generally include instructions for carrying out one or more of the methods described herein. Instructions included in kits can be affixed to packaging material or can be included as a package insert. While the instructions are typically written or printed materials, they are not limited to such. Any medium capable of storing such instructions and communicating them to an end user is contemplated by the disclosure. Such media can include, but are not limited to, electronic storage media (e.g., magnetic discs, tapes, cartridges, chips), optical media (e.g., CD ROM), RF tags, and the like. As used herein, the term “instructions” can include the address of an internet site that provides the instructions.

Devices

Flow Cells

The microwell array substrate can be packaged within a flow cell that provides for convenient interfacing with the rest of the fluid handling system and facilitates the exchange of fluids, e.g. cell and solid support suspensions, lysis buffers, rinse buffers, etc., that are delivered to the microwell array and/or emulsion droplet. Design features can include: (i) one or more inlet ports for introducing cell samples, solid support suspensions, or other assay reagents, (ii) one or more microwell array chambers designed to provide for uniform filling and efficient fluid-exchange while minimizing back eddies or dead zones, and (iii) one or more outlet ports for delivery of fluids to a sample collection point or a waste reservoir. The design of the flow cell can include a plurality of microarray chambers that interface with a plurality of microwell arrays such that one or more different cell samples can be processed in parallel. The design of the flow cell can further include features for creating uniform flow velocity profiles, i.e. “plug flow”, across the width of the array chamber to provide for more uniform delivery of cells and beads to the microwells, for example, by using a porous barrier located near the chamber inlet and upstream of the microwell array as a “flow diffuser”, or by dividing each array chamber into several subsections that collectively cover the same total array area, but through which the divided inlet fluid stream flows in parallel. In some embodiments, the flow cell can enclose or incorporate more than one microwell array substrate. In some embodiments, the integrated microwell array/flow cell assembly can constitute a fixed component of the system. In some embodiments, the microwell array/flow cell assembly can be removable from the instrument.

In general, the dimensions of fluid channels and the array chamber(s) in flow cell designs will be optimized to (i) provide uniform delivery of cells and beads to the microwell array, and (ii) minimize sample and reagent consumption. In some embodiments, the width of fluid channels will be between 50 um and 20 mm. In other embodiments, the width of fluid channels can be at least 50 um, at least 100 um, at least 200 um, at least 300 um, at least 400 um, at least 500 um, at least 750 um, at least 1 mm, at least 2.5 mm, at least 5 mm, at least 10 mm, at least 20 mm, at least 50 mm, at least 100 mm, or at least 150 mm. In yet other embodiments, the width of fluid channels can be at most 150 mm, at most 100 mm, at most 50 mm, at most 20 mm, at most 10 mm, at most 5 mm, at most 2.5 mm, at most 1 mm, at most 750 um, at most 500 um, at most 400 um, at most 300 um, at most 200 um, at most 100 um, or at most 50 um. In one embodiment, the width of fluid channels is about 2 mm. The width of the fluid channels can fall within any range bounded by any of these values (e.g. from about 250 um to about 3 mm).

In some embodiments, the depth of the fluid channels will be between 50 um and 2 mm. In other embodiments, the depth of fluid channels can be at least 50 um, at least 100 um, at least 200 um, at least 300 um, at least 400 um, at least 500 um, at least 750 um, at least 1 mm, at least 1.25 mm, at least 1.5 mm, at least 1.75 mm, or at least 2 mm. In yet other embodiments, the depth of fluid channels can be at most 2 mm, at most 1.75 mm, at most 1.5 mm, at most 1.25 mm, at most 1 mm, at most 750 um, at most 500 um, at most 400 um, at most 300 um, at most 200 um, at most 100 um, or at most 50 um. In one embodiment, the depth of the fluid channels is about 1 mm. The depth of the fluid channels can fall within any range bounded by any of these values (e.g. from about 800 um to about 1 mm).

Flow cells can be fabricated using a variety of techniques and materials known to those of skill in the art. In general, the flow cell will be fabricated as a separate part and subsequently either mechanically clamped or permanently bonded to the microwell array substrate. Examples of suitable fabrication techniques include conventional machining, CNC machining, injection molding, 3D printing, alignment and lamination of one or more layers of laser or die-cut polymer films, or any of a number of microfabrication techniques such as photolithography and wet chemical etching, dry etching, deep reactive ion etching, or laser micromachining. Once the flow cell part has been fabricated it can be attached to the microwell array substrate mechanically, e.g. by clamping it against the microwell array substrate (with or without the use of a gasket), or it can be bonded directly to the microwell array substrate using any of a variety of techniques (depending on the choice of materials used) known to those of skill in the art, for example, through the use of anodic bonding, thermal bonding, or any of a variety of adhesives or adhesive films, including epoxy-based, acrylic-based, silicone-based, UV curable, polyurethane-based, or cyanoacrylate-based adhesives.

Flow cells can be fabricated using a variety of materials known to those of skill in the art. In general, the choice of material used will depend on the choice of fabrication technique used, and vice versa. Examples of suitable materials include, but are not limited to, silicon, fused-silica, glass, any of a variety of polymers, e.g. polydimethylsiloxane (PDMS; elastomer), polymethylmethacrylate (PMMA), polycarbonate (PC), polypropylene (PP), polyethylene (PE), high density polyethylene (HDPE), polyimide, cyclic olefin polymers (COP), cyclic olefin copolymers (COC), polyethylene terephthalate (PET), epoxy resins, metals (e.g. aluminum, stainless steel, copper, nickel, chromium, and titanium), a non-stick material such as teflon (PTFE), or a combination of these materials.

Cartridges

In some embodiments of the system, the microwell array, with or without an attached flow cell, can be packaged within a consumable cartridge that interfaces with the instrument system. Design features of cartridges can include (i) one or more inlet ports for creating fluid connections with the instrument or manually introducing cell samples, bead suspensions, or other assay reagents into the cartridge, (ii) one or more bypass channels, i.e. for self-metering of cell samples and bead suspensions, to avoid overfilling or back flow, (iii) one or more integrated microwell array/flow cell assemblies, or one or more chambers within which the microarray substrate(s) are positioned, (iv) integrated miniature pumps or other fluid actuation mechanisms for controlling fluid flow through the device, (v) integrated miniature valves (or other containment mechanisms) for compartmentalizing pre-loaded reagents (for example, bead suspensions) or controlling fluid flow through the device, (vi) one or more vents for providing an escape path for trapped air, (vii) one or more sample and reagent waste reservoirs, (viii) one or more outlet ports for creating fluid connections with the instrument or providing a processed sample collection point, (ix) mechanical interface features for reproducibly positioning the removable, consumable cartridge with respect to the instrument system, and for providing access so that external magnets can be brought into close proximity with the microwell array, (x) integrated temperature control components or a thermal interface for providing good thermal contact with the instrument system, and (xi) optical interface features, e.g. a transparent window, for use in optical interrogation of the microwell array.

The cartridge can be designed to process more than one sample in parallel. The cartridge can further comprise one or more removable sample collection chamber(s) that are suitable for interfacing with stand-alone PCR thermal cyclers or sequencing instruments. The cartridge itself can be suitable for interfacing with stand-alone PCR thermal cyclers or sequencing instruments. The term “cartridge” as used in this disclosure can be meant to include any assembly of parts which contains the sample and beads during performance of the assay.

The cartridge can further comprise components that are designed to create physical or chemical barriers that prevent diffusion of (or increase path lengths and diffusion times for) large molecules in order to minimize cross-contamination between microwells. Examples of such barriers can include, but are not limited to, a pattern of serpentine channels used for delivery of cells and solid supports (e.g., beads) to the microwell array, a retractable platen or deformable membrane that is pressed into contact with the surface of the microwell array substrate during lysis or incubation steps, the use of larger beads, e.g. Sephadex beads as described previously, to block the openings of the microwells, or the release of an immiscible, hydrophobic fluid from a reservoir within the cartridge during lysis or incubation steps, to effectively separate and compartmentalize each microwell in the array.

The dimensions of fluid channels and the array chamber(s) in cartridge designs can be optimized to (i) provide uniform delivery of cells and beads to the microwell array, and (ii) minimize sample and reagent consumption. The width of fluid channels can be between 50 micrometers and 20 mm. In other embodiments, the width of fluid channels can be at least 50 micrometers, at least 100 micrometers, at least 200 micrometers, at least 300 micrometers, at least 400 micrometers, at least 500 micrometers, at least 750 micrometers, at least 1 mm, at least 2.5 mm, at least 5 mm, at least 10 mm, or at least 20 mm. In yet other embodiments, the width of fluid channels can be at most 20 mm, at most 10 mm, at most 5 mm, at most 2.5 mm, at most 1 mm, at most 750 micrometers, at most 500 micrometers, at most 400 micrometers, at most 300 micrometers, at most 200 micrometers, at most 100 micrometers, or at most 50 micrometers. The width of fluid channels can be about 2 mm. The width of the fluid channels can fall within any range bounded by any of these values (e.g. from about 250 micrometers to about 3 mm).

The fluid channels in the cartridge can have a depth. The depth of the fluid channels in cartridge designs can be between 50 micrometers and 2 mm. The depth of fluid channels can be at least 50 micrometers, at least 100 micrometers, at least 200 micrometers, at least 300 micrometers, at least 400 micrometers, at least 500 micrometers, at least 750 micrometers, at least 1 mm, at least 1.25 mm, at least 1.5 mm, at least 1.75 mm, or at least 2 mm. The depth of fluid channels can be at most 2 mm, at most 1.75 mm, at most 1.5 mm, at most 1.25 mm, at most 1 mm, at most 750 micrometers, at most 500 micrometers, at most 400 micrometers, at most 300 micrometers, at most 200 micrometers, at most 100 micrometers, or at most 50 micrometers. The depth of the fluid channels can be about 1 mm. The depth of the fluid channels can fall within any range bounded by any of these values (e.g. from about 800 micrometers to about 1 mm).

Cartridges can be fabricated using a variety of techniques and materials known to those of skill in the art. In general, the cartridges will be fabricated as a series of separate component parts (FIGS. 17A-C) and subsequently assembled using any of a number of mechanical assemblies or bonding techniques. Examples of suitable fabrication techniques include, but are not limited to, conventional machining, CNC machining, injection molding, thermoforming, and 3D printing. Once the cartridge components have been fabricated they can be mechanically assembled using screws, clips, and the like, or permanently bonded using any of a variety of techniques (depending on the choice of materials used), for example, through the use of thermal bonding/welding or any of a variety of adhesives or adhesive films, including epoxy-based, acrylic-based, silicone-based, UV curable, polyurethane-based, or cyanoacrylate-based adhesives.

Cartridge components can be fabricated using any of a number of suitable materials, including but not limited to silicon, fused-silica, glass, any of a variety of polymers, e.g. polydimethylsiloxane (PDMS; elastomer), polymethylmethacrylate (PMMA), polycarbonate (PC), polypropylene (PP), polyethylene (PE), high density polyethylene (HDPE), polyimide, cyclic olefin polymers (COP), cyclic olefin copolymers (COC), polyethylene terephthalate (PET), epoxy resins, non-stick materials such as teflon (PTFE), metals (e.g. aluminum, stainless steel, copper, nickel, chromium, and titanium), or any combination thereof.

The inlet and outlet features of the cartridge can be designed to provide convenient and leak-proof fluid connections with the instrument, or can serve as open reservoirs for manual pipetting of samples and reagents into or out of the cartridge. Examples of convenient mechanical designs for the inlet and outlet port connectors can include, but are not limited to, threaded connectors, Luer lock connectors, Luer slip or “slip tip” connectors, press fit connectors, and the like. The inlet and outlet ports of the cartridge can further comprise caps, spring-loaded covers or closures, or polymer membranes that can be opened or punctured when the cartridge is positioned in the instrument, and which serve to prevent contamination of internal cartridge surfaces during storage or which prevent fluids from spilling when the cartridge is removed from the instrument. The one or more outlet ports of the cartridge can further comprise a removable sample collection chamber that is suitable for interfacing with stand-alone PCR thermal cyclers or sequencing instruments.

The cartridge can include integrated miniature pumps or other fluid actuation mechanisms for control of fluid flow through the device. Examples of suitable miniature pumps or fluid actuation mechanisms can include, but are not limited to, electromechanically- or pneumatically-actuated miniature syringe or plunger mechanisms, membrane diaphragm pumps actuated pneumatically or by an external piston, pneumatically-actuated reagent pouches or bladders, or electro-osmotic pumps.

The cartridge can include miniature valves for compartmentalizing pre-loaded reagents or controlling fluid flow through the device. Examples of suitable miniature valves can include, but are not limited to, one-shot “valves” fabricated using wax or polymer plugs that can be melted or dissolved, or polymer membranes that can be punctured; pinch valves constructed using a deformable membrane and pneumatic, magnetic, electromagnetic, or electromechanical (solenoid) actuation; one-way valves constructed using deformable membrane flaps; and miniature gate valves.

The cartridge can include vents for providing an escape path for trapped air. Vents can be constructed according to a variety of techniques, for example, using a porous plug of polydimethylsiloxane (PDMS) or other hydrophobic material that allows for capillary wicking of air but blocks penetration by water.

The mechanical interface features of the cartridge can provide for easily removable but highly precise and repeatable positioning of the cartridge relative to the instrument system. Suitable mechanical interface features can include, but are not limited to, alignment pins, alignment guides, mechanical stops, and the like. The mechanical design features can include relief features for bringing external apparatus, e.g. magnets or optical components, into close proximity with the microwell array chamber (FIG. 17B).

The cartridge can also include temperature control components or thermal interface features for mating to external temperature control modules. Examples of suitable temperature control elements can include, but are not limited to, resistive heating elements, miniature infrared-emitting light sources, Peltier heating or cooling devices, heat sinks, thermistors, thermocouples, and the like. Thermal interface features can be fabricated from materials that are good thermal conductors (e.g. copper, gold, silver, etc.) and can comprise one or more flat surfaces capable of making good thermal contact with external heating blocks or cooling blocks.

The cartridge can include optical interface features for use in optical imaging or spectroscopic interrogation of the microwell array. The cartridge can include an optically transparent window, e.g. the microwell substrate itself or the side of the flow cell or microarray chamber that is opposite the microwell array, fabricated from a material that meets the spectral requirements for the imaging or spectroscopic technique used to probe the microwell array. Examples of suitable optical window materials can include, but are not limited to, glass, fused-silica, polymethylmethacrylate (PMMA), polycarbonate (PC), cyclic olefin polymers (COP), or cyclic olefin copolymers (COC).

While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein can be employed in practicing the invention. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.

EXAMPLES

Some aspects of the embodiments discussed above are disclosed in further detail in the following examples, which are not in any way intended to limit the scope of the present disclosure.

Example 1 Methods for Determining the Number of Distinct Targets at Spatial Locations in a Sample Using a Spatial Label

The disclosure provides for methods for determining the number of distinct targets and their distinct spatial locations in a sample using a spatial label on a stochastic barcode of the disclosure. A tissue thin-slice is separated into sections. The sections are placed on a substrate in a known way. In some embodiments, the sections are placed on the substrate such that they preserve the physical order of the tissue section. In other embodiments, the sections are placed on the substrate such that they do not preserve the physical order of the tissue section. The tissue section is contacted with a plurality of solid supports in a known way, such that a user knows which solid support with which spatial label contacted which section. A single solid support can contact each section of the tissue thin slice. The solid supports comprise a plurality of stochastic barcodes. The stochastic barcodes comprise a universal label, a spatial label, a cellular label, a molecular label, and a target-association region. The tissue thin section is imaged with the solid supports. The image captures the physical structure of the tissue thin slice and identifies the orientation of the solid supports associated with the tissue thin slice. For example, solid supports can be etched with an identifier that can be visible in the image. The sequence of the spatial label on each of the etched solid supports is pre-known.

The targets in the section of the tissue thin slice associate with the target-association region (e.g., through their poly(A) tail). The targets are cross-linked to the target-association region. The solid supports are removed from the tissue thin section. The targets associated with the target-association region are reverse transcribed using a reverse transcriptase, thereby generating a target-barcode molecule which is a transcript incorporating the labels of the stochastic barcode into its polynucleotide sequence. The target-barcode molecule is amplified using polymerase chain reaction. The sequence of the target-barcode molecule is determined, for example, through sequencing. The sequencing reaction determines the spatial label, the cellular label, the molecular label, and some or all of the sequence of the target. The number of distinct targets is counted, wherein the unique occurrences of a specific molecular label indicate a distinct target. The sequence of the spatial label is used to correlate the number of the distinct targets with a position in physical space of the tissue thin slice. A map is generated that displays the location and amount of a distinct target in the tissue thin section. The amount of the distinct target is displayed as a colorimetric intensity.
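The counting and mapping steps of this example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of the disclosure: the read tuples, the label strings, the target names, the lookup table from spatial label to (x, y) position, and the scaling of counts to a 0-to-1 intensity are all hypothetical.

```python
from collections import defaultdict


def count_targets_by_location(reads, label_positions):
    """Count distinct molecular labels per (position, target).

    reads: iterable of (spatial_label, molecular_label, target) tuples, one
        per sequenced target-barcode molecule; amplification duplicates
        are allowed and are collapsed here.
    label_positions: dict mapping each pre-known spatial label to a position.
    """
    seen = defaultdict(set)
    for spatial_label, molecular_label, target in reads:
        position = label_positions[spatial_label]
        seen[(position, target)].add(molecular_label)
    return {key: len(labels) for key, labels in seen.items()}


def intensity_map(counts, target):
    """Scale the counts for one target to intensities between 0 and 1."""
    per_position = {pos: n for (pos, t), n in counts.items() if t == target}
    peak = max(per_position.values(), default=0)
    if peak == 0:
        return {}
    return {pos: n / peak for pos, n in per_position.items()}


# Hypothetical sequencing output and spatial-label lookup table.
reads = [
    ("SP1", "ML_a", "GAPDH"),
    ("SP1", "ML_a", "GAPDH"),  # amplification duplicate, counted once
    ("SP1", "ML_b", "GAPDH"),
    ("SP2", "ML_a", "GAPDH"),
    ("SP2", "ML_c", "ACTB"),
]
label_positions = {"SP1": (0, 0), "SP2": (0, 1)}

counts = count_targets_by_location(reads, label_positions)
gapdh_map = intensity_map(counts, "GAPDH")
```

Collapsing reads that share a spatial label, target, and molecular label removes amplification duplicates before counting; the resulting intensities could then be rendered with any plotting tool.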

Example 2 Methods for Determining the Number of Distinct Targets at a Spatial Location in a Sample Using a Timing Correlation

The disclosure provides for methods for determining the number of distinct targets and their distinct spatial locations in a sample using a spatial label on a stochastic barcode of the disclosure and a timing correlation. A tissue thin-slice is separated into sections using a device. The device places the sections on a substrate in a known way. The sections are placed on a substrate such that they preserve the physical order of the tissue section. Alternatively, the sections are placed on a substrate such that they do not preserve the physical order of the tissue section.

In some embodiments, a device takes biopsies of a solid tissue at a given rate. The device places the biopsy samples on a substrate at a given location. The location of the biopsy sample is related to the rate at which the device took the biopsy samples. This is related to the time the device was in a specific location to take the biopsy sample.

In either case, the sample/section is contacted with a plurality of solid supports. The sample/section is contacted with a plurality of solid supports in a known way such that a user knows which solid support with which spatial label contacted which section. A single solid support can contact each section of the sample/section. The solid supports comprise a plurality of stochastic barcodes. The stochastic barcodes comprise a universal label, a spatial label, a cellular label, a molecular label, and a target-association region. The sample/section is imaged with the solid supports. The image captures the physical structure of the tissue thin slice and identifies the orientation of the solid supports associated with the tissue thin slice. For example, solid supports can be etched with an identifier that can be visible in the image. The sequence of the spatial label on each of the etched solid supports is pre-known.

The targets in the section of the sample/section associate with the target-association region (e.g., through their poly(A) tail). The targets are cross-linked to the target-association region. The solid supports are removed from the sample/section. The targets associated with the target-association region are reverse transcribed using a reverse transcriptase, thereby generating a target-barcode molecule which is a transcript incorporating the labels of the stochastic barcode into its polynucleotide sequence. The target-barcode molecule is amplified using polymerase chain reaction. The sequence of the target-barcode molecule is determined, for example, through sequencing. The sequencing reaction determines the spatial label, the cellular label, the molecular label, and some or all of the sequence of the target. The number of distinct targets is counted, wherein the unique occurrences of a specific molecular label indicate a distinct target. The sequence of the spatial label is used to correlate the number of the distinct targets with a specific solid support, which is correlated with a specific time at which the solid support was contacted to the sample/section. In this way, the position of the distinct targets in physical space of the sample/section can be analyzed. A map is generated that displays the location and amount of a distinct target in the sample/section. The amount of the distinct target is displayed as a colorimetric intensity.
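As an illustration of the timing correlation, the following Python sketch converts the time at which each solid support was contacted into a position. It assumes, purely for illustration, that the device samples at a constant rate along a straight path with a fixed spacing between samples; the function name, parameter names, and numeric values are hypothetical.

```python
def position_from_time(contact_time_s, sections_per_second, spacing_um):
    """Convert a contact time into a position along the sampling path.

    Assumes a constant sampling rate and a fixed spacing between samples.
    """
    index = round(contact_time_s * sections_per_second)  # which sample was taken
    return index * spacing_um

# Hypothetical record of when each solid support (by spatial label) was contacted.
contact_times_s = {"SP01": 0.0, "SP02": 2.0, "SP03": 4.0}

support_positions = {
    label: position_from_time(t, sections_per_second=0.5, spacing_um=100)
    for label, t in contact_times_s.items()
}
```

The resulting support_positions dictionary plays the role of the link between a spatial label and a physical position, so target counts obtained per spatial label can be placed on a map.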

Example 3 Method for Determining the Number of Distinct Targets at a Spatial Location in a Sample Using Label Lithography

The disclosure provides for methods for determining the number of distinct targets and their distinct spatial locations in a sample using lengths of spatial labels on a stochastic barcode of the disclosure. A tissue thin-slice is separated into sections. The sections are placed on a substrate in a known way. The sections are placed on a substrate such that they preserve the physical order of the tissue section. Alternatively, the sections are placed on a substrate such that they do not preserve the physical order of the tissue section. In some embodiments, the tissue thin slice is not separated into sections. In some embodiments, the tissue thin-slice is left intact.

In either instance, the tissue is contacted with a plurality of solid supports. A single solid support can contact each section of the tissue thin slice. The solid supports can comprise a pre-spatial label. The pre-spatial label is attached to the solid support. The pre-spatial label comprises a cellular label, a molecular label, and a target-association region. The pre-spatial label comprises an activatable consensus sequence. The activatable consensus sequence is activated to link to a spatial label block which comprises a corresponding activatable sequence. For example, the pre-spatial label comprises biotin and the spatial label block comprises avidin on one end and biotin on the other end. The spatial label block is a sequence of nucleotides that when concatenated together forms a spatial label.

Concatenation of the spatial label occurs in a geometric manner such that discrete spatial label blocks are added to specific pre-spatial labels at specific physical locations. Spatial label blocks are added in an increasing manner while moving from top to bottom across the tissue thin-slice. Spatial label blocks are then added in an increasing manner while moving from left to right across the tissue thin-slice. In this way, the length of the spatial label (i.e., comprising concatenated spatial label blocks) is indicative of a physical location in the tissue sample.
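One way to picture how concatenated blocks can indicate a position is the following Python sketch. It rests on an assumption not stated in this example: that the blocks added in the top-to-bottom pass and the blocks added in the left-to-right pass have different sequences, so that the two axes can be told apart. The block sequences and function names are hypothetical.

```python
ROW_BLOCK = "ACGT"  # hypothetical block added once per step from top to bottom
COL_BLOCK = "TTGA"  # hypothetical block added once per step from left to right

def build_spatial_label(row, col):
    """Concatenate `row` row blocks followed by `col` column blocks."""
    return ROW_BLOCK * row + COL_BLOCK * col

def decode_spatial_label(label):
    """Recover (row, col) by counting the leading row and column blocks."""
    row = 0
    while label.startswith(ROW_BLOCK):
        label = label[len(ROW_BLOCK):]
        row += 1
    col = 0
    while label.startswith(COL_BLOCK):
        label = label[len(COL_BLOCK):]
        col += 1
    if label:
        raise ValueError("label is not a concatenation of the known blocks")
    return row, col

label = build_spatial_label(3, 2)
location = decode_spatial_label(label)
```

Under these assumptions, the total length of the label grows with the distance from the top-left corner, and the block counts give the row and column.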

The tissue thin section is imaged with the solid supports before they have been linked to spatial label blocks. The image captures the physical structure of the tissue thin slice and identifies the orientation of the solid supports associated with the tissue thin slice. For example, solid supports can be etched with an identifier that can be visible in the image. The sequence of the spatial label on each of the etched solid supports is pre-known.

The targets in the section of the tissue thin slice associate with the target-association region (e.g., through their poly(A) tail). The targets are cross-linked to the target-association region. The solid supports are removed from the tissue thin section. The targets associated with the target-association region are reverse transcribed using a reverse transcriptase, thereby generating a target-barcode molecule which is a transcript incorporating the labels of the stochastic barcode into its polynucleotide sequence. The target-barcode molecule is amplified using polymerase chain reaction. The sequence of the target-barcode molecule is determined, for example, through sequencing. The sequencing reaction determines the spatial label, the cellular label, the molecular label, and some or all of the sequence of the target. The number of distinct targets is counted, wherein the unique occurrences of a specific molecular label indicate a distinct target. The length of the spatial label is used to correlate the number of the distinct targets with a position in physical space of the tissue thin slice. A map is generated that displays the location and amount of a distinct target in the tissue thin section. The amount of the distinct target is displayed as a colorimetric intensity.

Example 4 Combinatorial Methods for Generating Large Libraries of Unique Synthetic Particles with Both DNA Barcodes and Spectrally Resolvable Barcodes

This example demonstrates a combinatorial method to generate large libraries of at least 96³ unique synthetic particles with both DNA barcodes such as stochastic barcodes and spectrally resolvable barcodes such as optical barcodes.

The method is used to encode synthetic particles. As shown in FIG. 18A, each synthetic particle can contain 9 “anchor regions.” Each anchor region can have a size 1802 of about a few microns to tens of microns wide. The anchor regions can be arranged in a longitudinal format as shown in FIG. 18A, or a grid format as shown in FIG. 18B. The anchor regions can occupy the entire synthetic particle surface as shown in FIG. 18A, or part of the synthetic particle surface as shown in FIG. 18B. Each anchor region can have a unique optical label (OL) attached to the surface of the synthetic particle. FIGS. 18A-B show 9 optical labels, OL1-9, attached to the surface of the synthetic particle. The unique optical label can include an oligonucleotide sequence. The unique optical label can include a unique optical moiety. The unique optical moiety can be a fluorophore. OL1-3 can be used to encode cellular label part 1 corresponding to a first 96 unique cellular labels in the first encoding step. OL4-6 can be used to encode cellular label part 2 corresponding to a second 96 unique cellular labels in the second split step. OL7-9 can be used to encode cellular label part 3 corresponding to a third 96 unique cellular labels in the third split step. In addition to the ‘optical label,’ the entire synthetic particle surface or part of the synthetic particle surface can be attached with universal sequence (US) with 3′ up, i.e. 3′ end of the oligonucleotide is not attached to the synthetic particle.

At the first encoding step/the first split step, synthetic particles are distributed across 96 wells of a first plate and hybridize to oligonucleotides in each well. FIG. 19 shows the hybridization of oligonucleotides in the first encoding step. Each well can contain, for example, 4 types of oligonucleotides. The first type of oligonucleotides each can contain a region complementary to the universal sequence (US), followed by part 1 of the cellular label (1 of 96), followed by a linker sequence (linker 1). The second type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL1 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore, on the 3′ end. The fluorophore can be one of five possibilities such as 5 different fluorophores, or 4 different fluorophores and the possibility of having no fluorophore. The third type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL2 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore, on the 3′ end. The fluorophore can be one of five possibilities such as 5 different fluorophores, or 4 different fluorophores and the possibility of having no fluorophore. The fourth type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL3 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore, on the 3′ end. The fluorophore can be one of five possibilities such as 5 different fluorophores, or 4 different fluorophores and the possibility of having no fluorophore.

FIG. 20 is a lookup table showing the oligonucleotide content in each of the 96 wells in the first plate. The oligonucleotide content in each of the 96 wells in the first plate shows the correspondence of cellular label part 1 and encoding by OL1-3. The number of optical moieties in a group of spectrally-distinct optical moieties, i.e. the number of possibilities of fluorophores, has to be sufficient to allow encoding of at least the number of the unique synthetic particles. To encode 96 unique synthetic particles, k^n has to be greater than or equal to 96, where k is the number of optical moieties in the group of spectrally-distinct optical moieties and n is the number of regions. In this example, n of OL1-3 is 3, n of OL4-6 is 3, and n of OL7-9 is 3. Thus k has to be at least 5, since 4^3 = 64 < 96 while k^n = 5^3 = 125 > 96.
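The arithmetic above can be checked with a short Python sketch. The function names are hypothetical, and the base-k assignment of wells to optical moieties shown here is only one possible assignment; it is not taken from the lookup table of FIG. 20.

```python
def min_moieties(n_regions, n_codes):
    """Smallest k such that k ** n_regions >= n_codes."""
    k = 1
    while k ** n_regions < n_codes:
        k += 1
    return k

def well_to_moieties(well, k, n_regions):
    """Write a 0-based well index as n_regions base-k digits.

    Each digit selects one of k optical moieties for one region.
    """
    digits = []
    for _ in range(n_regions):
        well, digit = divmod(well, k)
        digits.append(digit)
    return tuple(reversed(digits))

k_needed = min_moieties(n_regions=3, n_codes=96)   # moieties needed per plate
library_size = 96 ** 3                             # three rounds of 96 wells
last_well = well_to_moieties(95, k=k_needed, n_regions=3)
```

With three regions per plate, the sketch gives k_needed equal to 5 and a library of 96 × 96 × 96 = 884,736 possible cellular labels over the three plates.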

After the synthetic particles are distributed across 96 wells of the first plate and hybridize to oligonucleotides in each well, DNA polymerase and DNA ligase can be introduced into each well. DNA polymerase can extend the US sequence with cellular label part 1 and linker 1 sequences. DNA ligase can covalently attach the fluorescent probe onto the OL oligonucleotides. FIG. 21 shows the single stranded oligonucleotides in the various regions on the synthetic particles after polymerization, ligation, and denaturation of duplex DNA in the first encoding step.

At the second encoding step including pool and second split, synthetic particles from all the wells of the first plate can be pooled, and split into each of the 96 wells of a second plate. FIG. 22 shows the hybridization of oligonucleotides in the second encoding step. Each well of the second plate can contain 4 types of oligonucleotides. The first type of oligonucleotides each can include linker 1, followed by part 2 of the cellular label (1 of 96), followed by another linker sequence (linker 2). The second type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL4 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore, on the 3′ end. The fluorophore can be one of five possibilities such as 5 different fluorophores, or 4 different fluorophores and the possibility of no fluorophore. The third type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL5 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore, on the 3′ end. The fluorophore can be one of five possibilities such as 5 different fluorophores, or 4 different fluorophores and the possibility of no fluorophore. The fourth type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL6 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore, on the 3′ end. The fluorophore can be one of five possibilities such as 5 different fluorophores, or 4 different fluorophores and the possibility of no fluorophore.

FIG. 23 is a lookup table showing the oligonucleotide content in each of the 96 wells in the second plate. The oligonucleotide content in each of the 96 wells in the second plate shows the correspondence of cellular label part 2 and encoding by OL4-6.

After the synthetic particles are distributed across 96 wells of the second plate and hybridize to oligonucleotides in each well, DNA polymerase and DNA ligase can be introduced into each well. DNA polymerase can extend the US-containing strand with cellular label part 2 and linker 2 sequences. DNA ligase can covalently attach the fluorescent probe onto the OL oligonucleotides. FIG. 24 shows the single stranded oligonucleotides in the various regions on the synthetic particles after polymerization, ligation, and denaturation of duplex DNA in the second encoding step.

At the third encoding step including pool and third split, synthetic particles from all the wells of the second plate can be pooled, and split into each of the 96 wells of a third plate. FIG. 25 shows the hybridization of oligonucleotides in the third encoding step. Each well of the third plate can contain 4 types of oligonucleotides. The first type of oligonucleotides each can include linker 2, followed by part 3 of the cellular label (1 of 96), followed by molecular index (randomers) and oligo(dA). The second type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL7 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore, on the 3′ end. The fluorophore can be one of five possibilities such as 5 different fluorophores, or 4 different fluorophores and the possibility of no fluorophore. The third type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL8 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore, on the 3′ end. The fluorophore can be one of five possibilities such as 5 different fluorophores, or 4 different fluorophores and the possibility of no fluorophore. The fourth type of oligonucleotides each can include a duplex structure that contains a strand complementary to OL9 with a small 5′ extension, and a shorter strand complementary to the extension on the longer strand. The shorter strand can include an optical label with an optical moiety, for example a fluorophore, on the 3′ end. The fluorophore can be one of five possibilities such as 5 different fluorophores, or 4 different fluorophores and the possibility of no fluorophore.

FIG. 26 is a lookup table showing the oligonucleotide content in each of the 96 wells in the third plate. The oligonucleotide content in each of the 96 wells in the third plate shows the correspondence of cellular label part 3 and encoding by OL7-9.

After the synthetic particles are distributed across 96 wells of the third plate and hybridize to oligonucleotides in each well, DNA polymerase and DNA ligase can be introduced into each well. DNA polymerase can extend the US-containing strand with cellular label part 3, molecular index, and oligo(dT) sequences. DNA ligase can covalently attach the fluorescent probe onto the OL oligonucleotides. FIG. 27 shows the single stranded oligonucleotides in the various regions on the synthetic particles after polymerization, ligation, and denaturation of duplex DNA in the third encoding step.
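The three split-pool rounds can be simulated with a minimal Python sketch. It assumes that each particle lands in a uniformly random well in each round and records only the well numbers, not the oligonucleotide chemistry; the function name and the seed parameter are hypothetical.

```python
import random

def split_pool(n_particles, n_wells=96, n_rounds=3, seed=0):
    """Simulate split-pool encoding.

    In each round every particle is placed in a random well (1..n_wells)
    and gains that well's cellular label part; particles are then pooled.
    Returns one tuple of label parts per particle.
    """
    rng = random.Random(seed)
    labels = [()] * n_particles
    for _ in range(n_rounds):
        labels = [label + (rng.randrange(1, n_wells + 1),) for label in labels]
    return labels

particles = split_pool(n_particles=1000)
n_possible_labels = 96 ** 3  # number of distinct three-part cellular labels
```

Each element of particles is a triple such as (part 1, part 2, part 3). Because wells are chosen at random, two particles can occasionally receive the same triple, which becomes less likely as the number of particles becomes small relative to the 884,736 possible labels.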

FIG. 28 shows an entire synthetic particle coated with both DNA barcodes and the spectrally resolvable barcode. Each DNA barcode, such as a stochastic barcode, can include a universal sequence, a cellular label, a molecular label, and an oligo(dT) region. The cellular label can include cellular label part 1, cellular label part 2, and cellular label part 3 separated by linker 1 and linker 2. Each resolvable barcode, such as an optical barcode, can include OL1-9 and the accompanying optical moieties.

FIG. 29 shows an exemplary combination of the spectrally resolvable barcode, OL1-9, of a synthetic particle. The synthetic particle can be attached with a unique combination of 9 optical moieties, such as 9 fluorescence regions. The unique combination of optical moieties corresponds to a unique cellular label sequence. The fluorescence in each fluorescence region within the synthetic particle can be detected using fluorescent imaging and image analysis. The cellular label attached with the synthetic particle can then be determined based on the three tables in FIGS. 20, 23, and 26. The combination of the 9 fluorescence regions corresponds to cellular label 12, 70, 22.
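The lookup from 9 fluorescence regions to a three-part cellular label can be sketched in Python as follows. The lookup table built here is hypothetical: it simply numbers the moiety combinations in order and is not the content of FIGS. 20, 23, and 26. Each moiety is represented by an integer from 0 to 4, which is also an assumption of the sketch.

```python
from itertools import product

K = 5          # possibilities per region (e.g., 4 fluorophores or none)
N_WELLS = 96   # cellular label parts per plate

def make_lookup():
    """Map each (moiety, moiety, moiety) triple to a well number 1..96.

    Hypothetical assignment: triples are numbered in lexicographic order.
    """
    triples = product(range(K), repeat=3)
    return {t: well for well, t in zip(range(1, N_WELLS + 1), triples)}

LOOKUPS = [make_lookup(), make_lookup(), make_lookup()]  # plates 1, 2, 3

def decode_signature(signature):
    """Turn moieties observed on OL1-9 into (part 1, part 2, part 3)."""
    if len(signature) != 9:
        raise ValueError("expected one moiety per region OL1-9")
    parts = []
    for plate, start in enumerate((0, 3, 6)):
        triple = tuple(signature[start:start + 3])
        if triple not in LOOKUPS[plate]:
            raise ValueError("moiety combination is not used on this plate")
        parts.append(LOOKUPS[plate][triple])
    return tuple(parts)

cell_label = decode_signature((0, 2, 1, 2, 3, 4, 0, 4, 1))
```

With this hypothetical table, the signature shown decodes to the three-part label (12, 70, 22); with the actual tables, the same procedure would be applied to the observed fluorescence of each region.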

Altogether, these data demonstrate the use of a split-pool method to encode synthetic particles with both DNA labels and spectral barcodes to generate large libraries of synthetic particles.

Example 5 Generation of Spatial Gene Expression Map of Tissue Slices

This example demonstrates the use of encoded synthetic particles from Example 4 to generate a spatial gene expression map of tissue slices.

First, the encoded synthetic particles are randomly sprinkled on a slide. Second, suspend the encoded synthetic particles. Upon drying, synthetic particles will be immobilized and form a non-overlapping monolayer. Third, scan the slide under different fluorescent channels. Fourth, analyze the image to deduce the spectral signature of each encoded synthetic particle, and deduce the cellular label identity using the lookup tables. Fifth, place a thin tissue section on top of the slide with the encoded synthetic particles. Sixth, place a piece of filter paper soaked with lysis buffer on top of the tissue section and apply pressure, and hold to allow cell lysis and mRNA hybridization. Seventh, layer cDNA synthesis reagents on top of the slide to carry out the cDNA synthesis reaction. Eighth, layer PCR reagents on top of the slide to generate copies of the cDNA. Alternatively, encoded synthetic particles can be retrieved at any step after mRNA hybridization, and the subsequent reactions can be carried out with encoded synthetic particles in tubes. Ninth, sequence PCR products to determine cellular label, molecular label, and gene identity. Tenth, map the molecules associated with each cell to the location on the slide. For each gene, obtain a 2D picture of the number of target molecules found at a specific location.
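The ninth and tenth steps can be sketched in Python as follows. The sketch assumes that image analysis has already produced a table from cellular label to slide coordinates and that sequencing has been reduced to (cellular label, molecular label, gene) tuples; the coordinates, gene names, and function name are hypothetical.

```python
from collections import defaultdict

def gene_picture(reads, particle_xy, gene, width, height):
    """Build a 2D grid of molecule counts for one gene.

    reads: iterable of (cellular_label, molecular_label, gene) tuples.
    particle_xy: cellular_label -> (x, y) grid position from the slide scan.
    Molecules are counted as unique molecular labels per particle.
    """
    molecules = defaultdict(set)
    for cell_label, molecular_label, read_gene in reads:
        if read_gene == gene and cell_label in particle_xy:
            molecules[cell_label].add(molecular_label)
    grid = [[0] * width for _ in range(height)]
    for cell_label, labels in molecules.items():
        x, y = particle_xy[cell_label]
        grid[y][x] += len(labels)
    return grid

# Hypothetical particle positions deduced from the optical barcodes.
particle_xy = {(12, 70, 22): (0, 0), (5, 9, 40): (2, 1)}
reads = [
    ((12, 70, 22), "AAGC", "GAPDH"),
    ((12, 70, 22), "AAGC", "GAPDH"),  # amplification copy of the same molecule
    ((12, 70, 22), "CTTA", "GAPDH"),
    ((5, 9, 40), "GGAT", "GAPDH"),
    ((5, 9, 40), "TCCA", "ACTB"),
]
picture = gene_picture(reads, particle_xy, "GAPDH", width=3, height=2)
```

Each entry of picture is the number of distinct molecules of the chosen gene found at that grid position, which could then be displayed as an intensity image.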

Altogether, these data demonstrate the use of encoded synthetic particles to generate a spatial gene expression map of tissue slices.

Example 6 Synthetic Particle Synthesis

This example demonstrates synthetic particle synthesis by stop flow lithography.

First, fabricate a microfluidic device (e.g. PDMS or NOA) with 9 input ports converging to a single channel, leading to 1 output port. Second, in each of the input ports, feed in a mixture of: poly(ethylene glycol) diacrylate (PEGDA), photoinitiator, 5′ acrydite modified universal sequence (US) oligonucleotide, and 5′ acrydite modified OL oligonucleotide (OL1 oligonucleotide for input port 1, OL2 oligonucleotide for input port 2, . . . , OL9 oligonucleotide for input port 9). Third, apply pressure at each of the input ports. The 9 inputs will form 9 parallel streams under laminar flow regime. Fourth, expose a region of the converged channel with UV through a photomask with the outline of the shape of the synthetic particle. Upon UV exposure, PEGDA and acrydite oligonucleotides can crosslink to form a solid hydrogel synthetic particle, with 9 regions each with a different OL oligonucleotide arranged side by side. The synthetic particles can be collected at the output port and used for the split-pool encoding process outlined in Example 4.

Altogether, these data demonstrate synthetic particle synthesis by stop flow lithography.

In at least some of the previously described embodiments, one or more elements used in an embodiment can interchangeably be used in another embodiment unless such a replacement is not technically feasible. It will be appreciated by those skilled in the art that various other omissions, additions and modifications may be made to the methods and structures described above without departing from the scope of the claimed subject matter. All such modifications and changes are intended to fall within the scope of the subject matter, as defined by the appended claims.

With respect to the use of substantially any plural and/or singular terms herein, those having skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for sake of clarity.

It will be understood by those within the art that, in general, terms used herein, and especially in the appended claims (e.g., bodies of the appended claims) are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes but is not limited to,” etc.). It will be further understood by those within the art that if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases “at least one” and “one or more” to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an” (e.g., “a” and/or “an” should be interpreted to mean “at least one” or “one or more”); the same holds true for the use of definite articles used to introduce claim recitations. In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should be interpreted to mean at least the recited number (e.g., the bare recitation of “two recitations,” without other modifiers, means at least two recitations, or two or more recitations). Furthermore, in those instances where a convention analogous to “at least one of A, B, and C, etc.” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., “a system having at least one of A, B, and C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). In those instances where a convention analogous to “at least one of A, B, or C, etc.” is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., “a system having at least one of A, B, or C” would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase “A or B” will be understood to include the possibilities of “A” or “B” or “A and B.”

In addition, where features or aspects of the disclosure are described in terms of Markush groups, those skilled in the art will recognize that the disclosure is also thereby described in terms of any individual member or subgroup of members of the Markush group.

As will be understood by one skilled in the art, for any and all purposes, such as in terms of providing a written description, all ranges disclosed herein also encompass any and all possible sub-ranges and combinations of sub-ranges thereof. Any listed range can be easily recognized as sufficiently describing and enabling the same range being broken down into at least equal halves, thirds, quarters, fifths, tenths, etc. As a non-limiting example, each range discussed herein can be readily broken down into a lower third, middle third and upper third, etc. As will also be understood by one skilled in the art all language such as “up to,” “at least,” “greater than,” “less than,” and the like include the number recited and refer to ranges which can be subsequently broken down into sub-ranges as discussed above. Finally, as will be understood by one skilled in the art, a range includes each individual member. Thus, for example, a group having 1-3 articles refers to groups having 1, 2, or 3 articles. Similarly, a group having 1-5 articles refers to groups having 1, 2, 3, 4, or 5 articles, and so forth.

While various aspects and embodiments have been disclosed herein, other aspects and embodiments will be apparent to those skilled in the art. The various aspects and embodiments disclosed herein are for purposes of illustration and are not intended to be limiting, with the true scope and spirit being indicated by the following claims.

What is claimed is:
 1. A method for determining spatial locations of a plurality of single cells, comprising: stochastically barcoding the plurality of single cells using a plurality of synthetic particles, wherein each of the plurality of synthetic particles comprises a plurality of stochastic barcodes, a first group of optical labels, and a second group of optical labels, wherein each of the plurality of stochastic barcodes comprises a cellular label and a molecular label, wherein each optical label in the first group of optical labels comprises a first optical moiety and each optical label in the second group of optical labels comprises a second optical moiety, and wherein each of the plurality of synthetic particles is associated with an optical barcode comprising the first optical moiety and the second optical moiety; detecting the optical barcode of each of the plurality of synthetic particles to determine the location of each of the plurality of synthetic particles; and determining the spatial locations of the plurality of single cells based on the locations of the plurality of synthetic particles.
 2. The method of claim 1, wherein the first optical moiety and the second optical moiety are selected from a group consisting of two or more spectrally-distinct optical moieties.
 3. The method of claim 1, wherein stochastically barcoding the plurality of single cells using the plurality of synthetic particles comprises contacting the plurality of single cells with the plurality of synthetic particles.
 4. The method of claim 3, wherein a synthetic particle of the plurality of synthetic particles is in close proximity to a single cell or a small number of cells.
 5. The method of claim 3, wherein each of the plurality of single cells comprises a plurality of targets, wherein stochastically barcoding the plurality of single cells further comprises hybridizing the plurality of stochastic barcodes with the plurality of targets to generate stochastically barcoded targets, and wherein at least one of the plurality of targets is hybridized to one of the plurality of stochastic barcodes.
 6. The method of claim 1, wherein cellular labels of at least two stochastic barcodes of the plurality of stochastic barcodes on one synthetic particle have the same sequence, and wherein cellular labels of at least two stochastic barcodes of the plurality of stochastic barcodes on different synthetic particles have different sequences.
 7. The method of claim 1, wherein molecular labels of at least two stochastic barcodes of the plurality of stochastic barcodes on one synthetic particle have different sequences.
 8. The method of claim 1, wherein the molecular labels are selected from a group consisting of at least 100 molecular labels with unique sequences.
 9. The method of claim 1, wherein the molecular labels are selected from a group consisting of at least 1000 molecular labels with unique sequences.
 10. The method of claim 1, wherein detecting the optical barcode of each of the plurality of synthetic particles to determine the location of each of the plurality of synthetic particles comprises generating an optical image showing the optical barcodes and the locations of the plurality of synthetic particles.
 11. The method of claim 1, wherein the plurality of single cells comprises cells distributed across a microwell array comprising microwells.
 12. The method of claim 11, wherein each of the plurality of single cells comprises a plurality of targets, the method comprising: lysing the plurality of single cells; and generating an indexed library of stochastically barcoded targets, wherein generating an indexed library of stochastically barcoded targets comprises hybridizing the plurality of stochastic barcodes with the plurality of targets to generate stochastically barcoded targets, wherein the molecular label comprises a molecular label sequence, wherein the cellular label comprises a cellular label sequence, and wherein each of the stochastically barcoded targets comprises a cellular label sequence, a molecular label sequence, and at least a portion of a complementary sequence of one of the plurality of targets.
 13. The method of claim 12, comprising: amplifying the stochastically barcoded targets of the indexed library to generate amplified stochastically barcoded targets; and sequencing the amplified stochastically barcoded targets to determine the number of amplified stochastically barcoded targets with unique molecular label sequences and identical complementary sequence, wherein the number of amplified stochastically barcoded targets with unique molecular label sequences and identical complementary sequence is substantially the same as the occurrences of targets with sequences complementary of the identical complementary sequence in the single cell or the small number of cells.
 14. The method of claim 13, wherein the labeled target molecules are amplified using bridge amplification, amplification with a gene specific primer, amplification with a universal primer, amplification with an oligo(dT) primer, or any combination thereof.
 15. The method of claim 1, wherein the plurality of single cells comprises a tissue, a cell monolayer, fixed cells, a tissue section, or any combination thereof.
 16. The method of claim 1, wherein a synthetic particle of the plurality of synthetic particles is a bead.
 17. The method of claim 16, wherein the bead is selected from the group consisting of streptavidin beads, agarose beads, magnetic beads, conjugated beads, protein A conjugated beads, protein G conjugated beads, protein A/G conjugated beads, protein L conjugated beads, oligo(dT) conjugated beads, silica beads, silica-like beads, anti-biotin microbead, anti-fluorochrome microbead, and any combination thereof.
 18. The method of claim 1, wherein a synthetic particle of the plurality of synthetic particles comprises a material selected from the group consisting of polydimethylsiloxane (PDMS), polystyrene, glass, polypropylene, agarose, hydrogel, paramagnetic, ceramic, plastic, glass, methylstyrene, acrylic polymer, titanium, latex, sepharose, cellulose, nylon, silicone, and any combination thereof.
 19. A synthetic particle composition, comprising: a synthetic particle; a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a cellular label and a molecular label; a first group of optical labels; and a second group of optical labels, wherein the plurality of stochastic barcodes, the first group of optical labels, and the second group of optical labels are attached to the surface of the synthetic particle, wherein each optical label in the first group of optical labels comprises a first optical moiety and each optical label in the second group of optical labels comprises a second optical moiety, and wherein the synthetic particle is associated with an optical barcode comprising the first optical moiety and the second optical moiety.
 20. The synthetic particle composition of claim 19, wherein the molecular labels of the plurality of stochastic barcodes are different from one another, and the molecular labels are selected from a group consisting of at least 100 molecular labels with unique sequences.
 21. The synthetic particle composition of claim 19, wherein cellular labels of at least two stochastic barcodes of the plurality of stochastic barcodes have the same sequence.
 22. The synthetic particle composition of claim 19, wherein molecular labels of at least two stochastic barcodes of the plurality of stochastic barcodes have different sequences.
 23. The synthetic particle composition of claim 19, wherein molecular labels of the plurality of stochastic barcodes are selected from a group consisting of at least 100 molecular labels with unique sequences.
 24. The synthetic particle composition of claim 19, wherein molecular labels of the plurality of stochastic barcodes are selected from a group consisting of at least 1000 molecular labels with unique sequences.
 25. The synthetic particle composition of claim 19, wherein the first optical moiety and the second optical moiety are selected from a group consisting of two or more spectrally-distinct optical moieties.
 26. The synthetic particle composition of claim 19, wherein each of the plurality of stochastic barcodes comprises a spatial label, and wherein spatial labels of at least two stochastic barcodes of the plurality of stochastic barcodes differ from each other by at least one nucleotide.
 27. The synthetic particle composition of claim 19, wherein each of the plurality of stochastic barcodes further comprises a universal label, and wherein universal labels of at least two stochastic barcodes of the plurality of stochastic barcodes have the same sequence.
 28. The synthetic particle composition of claim 19, wherein the synthetic particle is a bead.
 29. The synthetic particle composition of claim 28, wherein the bead is selected from the group consisting of streptavidin beads, agarose beads, magnetic beads, conjugated beads, protein A conjugated beads, protein G conjugated beads, protein A/G conjugated beads, protein L conjugated beads, oligo(dT) conjugated beads, silica beads, silica-like beads, anti-biotin microbead, anti-fluorochrome microbead, and any combination thereof.
 30. The synthetic particle composition of claim 19, wherein the synthetic particle comprises a material selected from the group consisting of polydimethylsiloxane (PDMS), polystyrene, glass, polypropylene, agarose, hydrogel, paramagnetic, ceramic, plastic, glass, methylstyrene, acrylic polymer, titanium, latex, sepharose, cellulose, nylon, silicone, and any combination thereof.
 31. A method for determining the number and spatial locations of a plurality of targets in a sample, comprising: providing a solid support comprising a plurality of synthetic particles associated with a plurality of stochastic barcodes, wherein each of the plurality of stochastic barcodes comprises a spatial label and a molecular label; decoding the solid support by contacting the solid support with a plurality of decoding nucleic acids labeled with a decoding label and detecting the presence of the decoding label, wherein at least some portion of each of the plurality of stochastic barcodes is single stranded to allow hybridization to a decoding nucleic acid, and wherein decoding comprises two or more sequential hybridizations of decoding nucleic acids to each of the plurality of stochastic barcodes; stochastically barcoding the plurality of targets in the sample by hybridizing the plurality of stochastic barcodes with the plurality of targets to generate stochastically barcoded targets; identifying the spatial location of each of the plurality of targets by correlating the spatial labels of the plurality of the stochastic barcodes with the spatial locations of the plurality of targets in the sample; and estimating the number of each of the plurality of targets by determining sequences of the spatial labels and molecular labels of the plurality of the stochastic labels and counting the number of the molecular labels with distinct sequences.
 32. The method of claim 31, wherein the sample comprises a plurality of cells, and wherein the plurality of targets is associated with the plurality of cells.
 33. The method of claim 31, wherein the sample comprises a tissue, a cell monolayer, fixed cells, a tissue section, or any combination thereof.
 34. The method of claim 31, wherein the sample is physically divided during stochastically barcoding the plurality of targets in the sample.
 35. The method of claim 31, wherein the spatial locations of the plurality of targets in the sample are on a surface of the sample, inside the sample, subcellularly in the sample, or any combination thereof.
 36. The method of claim 31, wherein stochastically barcoding the plurality of targets in the sample is performed on the surface of the sample, subcellularly in the sample, inside the sample, or any combination thereof.
 37. The method of claim 31, wherein the plurality of targets comprises ribonucleic acids (RNAs), messenger RNAs (mRNAs), microRNAs, small interfering RNAs (siRNAs), RNA degradation products, RNAs each comprising a poly(A) tail, and any combination thereof.
 38. The method of claim 31, further comprising visualizing the plurality of targets in the sample.
 39. The method of claim 38, wherein visualizing the plurality of targets in the sample comprises mapping the plurality of targets onto a map of the sample.
 40. The method of claim 31, wherein the synthetic particles are beads.
 41. The method of claim 40, wherein the beads are silica gel beads, controlled pore glass beads, magnetic beads, dynabeads, sephadex/sepharose beads, cellulose beads, polystyrene beads, hydrogel beads, or any combination thereof.
 42. The method of claim 31, wherein the molecular labels of different stochastic barcodes are different from one another.
 43. The method of claim 31, wherein the sample is intact during stochastically barcoding the plurality of targets in the sample. 
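By way of illustration only, and without limiting the claims, the estimating step recited in claim 31 (counting the number of molecular labels with distinct sequences for each target at each spatial label) can be sketched in Python as follows. The function name count_targets, the read tuples, and the label and target strings are hypothetical and are not taken from the disclosure. The sketch assumes that each sequencing read has already been resolved into a spatial label, a molecular label, and a target identity, and it performs no correction of sequencing or amplification errors.

```python
from collections import defaultdict


def count_targets(reads):
    """Estimate target counts per spatial label.

    reads: iterable of (spatial_label, molecular_label, target) tuples,
    one per sequencing read (hypothetical, pre-parsed input).
    Returns a dict mapping (spatial_label, target) to the number of
    distinct molecular label sequences observed for that pair.
    """
    distinct_labels = defaultdict(set)
    for spatial_label, molecular_label, target in reads:
        # Reads sharing a spatial label, target, and molecular label are
        # treated as amplification copies of one original molecule.
        distinct_labels[(spatial_label, target)].add(molecular_label)
    return {key: len(labels) for key, labels in distinct_labels.items()}


# Hypothetical example reads; all sequences and names are illustrative.
reads = [
    ("SP01", "AACG", "GAPDH"),
    ("SP01", "AACG", "GAPDH"),  # duplicate read of the same molecule
    ("SP01", "TTGC", "GAPDH"),
    ("SP02", "AACG", "GAPDH"),
    ("SP02", "CCAT", "ACTB"),
]
counts = count_targets(reads)
for (spatial_label, target), n in sorted(counts.items()):
    print(spatial_label, target, n)
```

In this sketch, the two identical reads at the hypothetical spatial label SP01 collapse to a single molecule, so the estimated count reflects distinct molecular labels rather than read depth. Correlating each spatial label with a physical location in the sample, as recited in the identifying step, would require a separate lookup that is not shown.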